FOREWORD


Research has long been considered the exclusive privilege of a special
class of intellectuals in scientific circles. Though this view may have
some validity, in the human sciences, and especially in the social
sciences, research is increasingly becoming a need for field workers who
wish to improve the quality of their work. The hegemony of research
scholars needs to be de-mystified by rendering this discipline
accessible to the uninitiated.


Conceived with the need of this category of the uninitiated, the 
author of this work aims to make the somewhat complicated and often 
abstract debates and concepts which have characterised the area of 
research methodology in social sciences intelligible, feasible and 
operational. Hence this Primer is of great value both to the teacher and 
the taught, if at all there is a strict division between the two in reality 
and in a chronological perspective. 


This book is largely the result of the untiring efforts of Prof. 
P. Ramachandran, who in his 37 years of experience of imparting 
knowledge and skills in research methodology, both to the students of 
Tata Institute of Social Sciences, Bombay, and to several social workers 
of various government and voluntary organisations, has been constantly 
searching and researching ways and means of rendering this subject in 
a lucid manner. Today we are fortunate to have this publication in our
hands, meeting a greatly felt need. It will go a long way in effectively helping
social activists, social analysts, social workers and social science 
students. The examples cited make it even more relevant for those 
working with people. As the President of the Institute for Community
Organisation Research, Bombay, I am proud that we could bring out
such a useful primer.


This book will be a landmark in the field of publications on 
Research Methodology and will help everyone to improve the quality 
of work by processing the field data in an intelligent and relevant way. 
We are very grateful to Prof. P. Ramachandran for his monumental work 
which will be testified by every user. 


Dr. Yvon Ambroise 
October 1993 President-ICOR 


The objectives of the Institute for Community Organisation
Research, established in October 1989 and registered under the
Societies Registration Act 1860 and The Bombay Public Trust Act
1950, are:


1. To establish, maintain and develop the Institute as a research
   and training centre.

2. To undertake Research in Community Organisation as under:

   a. Identify, specify and operationalise primary concepts of
      relevance to the understanding of community organisation

   b. Undertake empirical research, including secondary analysis
      of available data from different sources on issues relevant
      to community organisation including studies of organisations,
      personnel and people involved either as benefactors, or
      beneficiaries of community organisation programmes. To this
      end the centre may also undertake to develop models of
      monitoring and evaluating studies which could be utilised
      by organisations which work in the field of community
      organisation

   c. Identify and select in collaboration with other national
      organisations, major themes of relevance to the understanding
      and advancing of community organisation and undertake
      fundamental and applied research on each theme

3. To undertake training in research and social analytical skills for
   workers in the field of community organisation

4. To undertake documentation of material and disseminate
   information on community organisation for individuals and
   organisations involved in the pursuit of community organisation

5. To co-operate and collaborate with other organisations,
   national and international, in such training and research
   activities as would further the understanding of community
   organisation to bring about the progress of human beings

6. To arrange seminars, conferences, symposia, etc., for the benefit
   of those who are interested in community organisation

7. To publish books and literature on community organisation and
   thereby to educate the masses and further the cause of
   community organisation and the progress of human beings

8. To undertake similar other activities as may be deemed to be
   necessary to promote understanding and better professional
   practice in community organisation.

PREFACE


Woman (and this term includes man as well) has always sought 
answers to questions which have risen in her mind, in response 
to what she observes around her, in response to a felt need, for 
the solution of some difficulties, for ameliorating her sufferings, 
in search of a better life and for a number of other reasons. Some
of the major factors that call for appropriate, reliable, and valid 
information can be conveniently classified into two major 
categories — individual and society. The major components of 
each can be identified as: 


1. Individual focussed initiatives to develop: 
(a) critical thinking on particular social problems 
(b) an analytical mind 
(c) systematic disciplined approach to the study of problems 
(d) research mindedness 
(e) control of emotional bias and ensure objectivity. 
2. Society focussed factors are: 
(a) to obtain a better understanding of society 
(b) to understand social problems and suggest appropriate 
solutions or courses of action 
(c) to throw light on controversial topics and gather verifiable 
knowledge which would help resolve disputes 
(d) to obtain data for developing and evaluating programmes 


By trial and error, she has found satisfactory answers to some 
questions, incomplete or inadequate answers to some, and no 
answers to other questions. 


An important point to which attention may now be drawn relates 
to the answer that is found. Whether or not the answer is reliable 
and valid would depend on a number of factors. Two of the more
important of these factors are: 


(1) the characteristics of the person looking for the answers, 
particularly, the ability to undertake the search and efforts she 
puts into the task 

(2) the methodology for pursuing the query, especially how the 
question itself is asked and the procedure adopted to find the 
answer. 


The important consideration here is that, if another researcher 
decides to find answers to the same question, she too would be 
expected to arrive at the same answers that you obtained. But, if 
the answers found by the other researcher differ from your answers 
it is not necessary to conclude that one of you is wrong. It would 
be useful in such cases to consider whether one or more of the 
following factors could have been responsible for the observed 
differences in the two sets of results: 


(1) The questions that were researched by both of you were not
exactly the same ones. For example, it is one thing to ask: Are
delinquents compared to non-delinquents, more likely to be 
atheistic? It is quite another to ask: Are atheists compared to 
religious persons, more likely to be delinquents? In the first 
instance, you are comparing the behaviour of delinquents and 
non-delinquents to find out if one group contains a larger 
proportion of religious persons. In the second question, you 
are comparing religious persons with non-religious persons to 
find out if there is a significant difference in the proportion 
of delinquents among them. 

(2) The procedures used to find the answers were not the same. 
Both might have used the same ‘tools’ for obtaining data but 
adopted different ways of eliciting the information, with one 
using an indirect procedure (questionnaire for example) and 
another using a direct approach (an interview). 

(3) The influence of critical circumstances was not taken into
account. For example, while one study was conducted in the summer
months (usually vacation period), the other one might have
been conducted in the winter months.

(4) The time interval between the studies may have been long
enough for real changes to have occurred in the situation. For
example, if in the period between the two studies the
legislation on what constitutes delinquency has changed,
then obviously the measurement of the delinquency
rate or level would also be different.
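
The difference between the two questions in point (1) can be seen with a little arithmetic. The sketch below is only an illustration: the counts are invented for the purpose and do not come from any study, and the helper name `proportion` is hypothetical. It shows that the same table of people can give two different answers depending on which group is taken as the base of comparison.

```python
# Hypothetical counts, invented purely for illustration (not data from any study).
# Each key is (delinquency status, religious position); each value is a head count.
counts = {
    ("delinquent", "atheist"): 30,
    ("delinquent", "religious"): 70,
    ("non_delinquent", "atheist"): 90,
    ("non_delinquent", "religious"): 810,
}


def proportion(counts, cell, given_index, given_value):
    """Share of `cell` among all cells whose key[given_index] equals given_value."""
    base = sum(n for key, n in counts.items() if key[given_index] == given_value)
    return counts[cell] / base


# First question: among delinquents, what proportion are atheists?
atheists_among_delinquents = proportion(
    counts, ("delinquent", "atheist"), 0, "delinquent"
)

# Second question: among atheists, what proportion are delinquents?
delinquents_among_atheists = proportion(
    counts, ("delinquent", "atheist"), 1, "atheist"
)

print(f"Atheists among delinquents: {atheists_among_delinquents:.2f}")
print(f"Delinquents among atheists: {delinquents_among_atheists:.2f}")
```

With these invented counts the first proportion is 30 out of 100 and the second is 30 out of 120, so two researchers who each believed they were studying "the same question" would report different figures.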


If the questions were identical, the procedures identical, and the 
influence of other factors uniform, the outcome must be identical, 
and one would say that the answer was ascertained in a systematic 
way (to permit replications), and the procedures fulfill all require- 
ments of objectivity. 


In terms of what may be called the final output, the findings must 
be objective, reliable and valid as well as useful in understanding 
society, and for utilising the findings to initiate change in society. 


Reliability refers to the fact that the same results would be
obtained if the study was conducted again. Validity refers to the
extent to which the findings are relevant to the questions that were
posed. Thus, the findings of a study would be considered reliable
when they can be verified by others who, using the same procedures
and methods that the earlier researcher has used, would also
arrive at the same or similar results. This can be assured by the
use of systematic procedures.
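

One simple way to put a number on "the same or similar results" is to compare, respondent by respondent, the answers recorded in an original study and in a repeat of it. The sketch below assumes this percent-agreement index, which is only one of several possible measures and is not prescribed by this Primer; the function name and the sample answers are hypothetical.

```python
def percent_agreement(first_study, repeat_study):
    """Fraction of respondents whose recorded answer is identical in both studies."""
    if len(first_study) != len(repeat_study):
        raise ValueError("both studies must cover the same respondents")
    if not first_study:
        raise ValueError("no responses to compare")
    matches = sum(a == b for a, b in zip(first_study, repeat_study))
    return matches / len(first_study)


# Invented answers from five respondents, in the same order in both lists.
first = ["yes", "no", "yes", "yes", "no"]
repeat = ["yes", "no", "no", "yes", "no"]

agreement = percent_agreement(first, repeat)
print(f"Agreement between the two studies: {agreement:.0%}")
```

A value close to 1 suggests that the procedures are yielding repeatable results; how close is close enough remains a judgment the researcher has to justify.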


You have to take a number of steps in order to do a research 
project. What follows is not a definitive rigid prescription of such 
steps. At best, these constitute a general guide to initiate you into
doing survey research, and as you grow in experience, you can 
appropriately modify the procedures for undertaking your studies. 
It is important to remember that each step is interrelated to and 
finally integrated with, the other steps in the concerned stage. In 
other words, each in its turn will influence and be influenced by 
the other. Within each ‘step’ could be some ‘sub-steps’ to guide
you through your research, which will be discussed later. 


1. PLANNING

   1.1 Problem Formulation
       1.1.1 Identifying the Research Problem Area
       1.1.2 Selecting the Specific Issue for Research
       1.1.3 Formulating the Research Objectives
       1.1.4 Clarifying the Objectives

   1.2 Research Strategy
       1.2.1 Scope of Study
             1.2.1.1 Time
             1.2.1.2 Place
             1.2.1.3 Population
             1.2.1.4 Source of Information
       1.2.2 Generic Strategy
       1.2.3 Specific Research Designs
             1.2.3.1 Methods Design
             1.2.3.2 Sample Design
             1.2.3.3 Analysis Design
             1.2.3.4 Organisation Design

2. EXECUTION

   2.1 Data Collection
   2.2 Data Processing
   2.3 Data Analysis

3. REPORTING


The question of whether or not one should go through a particular 
step in a research project, will depend on the answer to another 
question: How much is already known about the phenomenon 
which is intended to be studied? And, how specific is one’s choice 
right at the outset? 


The technical terms used here are ‘generic operational definitions’ 
as consensus would have it. Remember, what you have in your 
hands is not a ‘textbook’ in survey research. This document,
hopefully, will help you to get the rudiments of how to move 
from the research question to its answer or answers. 


In reality, research is a search, a creative combination of known 
elements, at best in a previously unknown order. So, you have to 
cultivate and internalise a framework of perception, knowledge, 
and skills relevant to your vocation. Build up your spirit of looking 
at the puzzles of creation, acquire the ability to be hospitable to 
insight and equip yourself to bring these to fruition. 


You will always be exposed to many varied things and ideas. You 
must learn to distinguish the relevant from the irrelevant and to 
ask the right question to decide what is relevant and what is not. 


You must, however, be aware of some of the major obstacles to 
research. First, the universe of study is always the human being. The
dynamic nature of human beings in contrast to material inanimate 
objects makes an accurate study of the former difficult. 


Second, objectivity is difficult in the social sciences. People being 
the subject matter of research, and usually sensitive, may not 
answer all questions put to them. Where answers are forthcoming, 
they may not always be reliable, as respondents may not give the 
actual answers but what they consider to be the correct answer. 
Similarly, some behaviour patterns cannot be easily observed. 


Third, human behaviour is complex. While we may pinpoint one 
stimulus, the responses to it may be many and varied. Similarly, 
a given response may be the result of any one of several possible 
stimuli. Under such circumstances, the study of cause and effect 
becomes difficult. 


Fourth, the time available for a study is limited. When studies 
take a long time to complete, the results may become obsolete. 
Hence, the purpose of such research as a guide to the future is 
not fulfilled. Even where studies are of a short duration, it is 
possible that factors may change because of the dynamic nature 
of society. The problem may also change and assume new aspects
or relegate earlier dominant aspects to those of minor importance.


Fifth, it is difficult to collect facts by experiment as far as human 
beings are concerned. 


Sixth, we deal with society, and the researcher herself is part of 
society. Hence, she may find it difficult to work at the data 
objectively and interpret them impartially. But careful training 
could help one to be objective to a high degree. 


Research is essentially a careful search, a studious enquiry and a 
critical investigation. Though these are all appropriate phrases, 
they miss a fundamental element, the human agent who seeks 
with devotion and passion. Research begins and ends, not with 
methods and techniques, but with the curious human being. The 
researcher is her best instrument. It is her intelligence, her emo- 
tions, her imagination, and her discipline that decide how she 
moves from QUESTION to ANSWER. The less the imagination, 
discipline, perception, sensitivity, logic and method, the poorer 
her perception of the path from Question to Answer. 


Finally, I have incorporated into the Primer four examples, all
from studies that I have conducted. For purposes of inclusion here, 
the original study designs have been somewhat modified. For the 
benefit of those who are interested in reading the original research 
reports, I give below the list of published reports of the four studies:


1. Women and Employment: Report of Pilot Studies conducted 
in Delhi and Bombay, Bombay: Tata Institute of Social 
Sciences, 1970 (Author of Bombay Report) 

2. Housing Situation in Greater Bombay, Bombay: Somaiya 
Publications Ltd., 1977 (reprint) 

3. An Attempt at Raising Consciousness 
Secunderabad: Andhra Pradesh Social Service Society, 1985 

4. Towards Integrated Human Development 
Caritas India, New Delhi, 1990. 
CHAPTER 1


PROBLEM 
FORMULATION 


INTRODUCTION 


In music we begin with sa re ga or do re mi. In learning to read,
you begin with a b c. These are building blocks in music and
reading. Similarly in research, we begin with problem formulation.
It would be useful to mention that this is the most important stage
in a research study. This stage has a key role to play in determining
the subsequent stages and steps in research. Thus, it will be the
‘reference point’ in deciding the general research strategy as well
as the specific designs that will be adopted to execute the study.


A word of advice regarding problem formulation is that the
more effort you put into this step in research, the more productive
your work will be. It cannot be emphasised enough that
this is the most crucial part of any research study. The less
effort you put into it and the more indifferent you are to problem
formulation, the more your ‘fears’ will increase as you proceed,
especially after you have collected all the information which you
have decided to gather. For, you will not know what to do with
what you have collected. Most persons I have met have, at this
late stage after data collection, come for emergency salvage
and rescue operations.
The primary function of the problem formulation stage is to
decide, as precisely as possible, the research question(s) to be 
studied. As the key words indicate, we have to formulate the 
‘research problem’ or rather the research question(s) for which 
we wish to find answers. This is best done through the following 
four steps: 


A. Identifying the probable issues for research 

B. Selecting the specific research issue to be studied
C. Formulating the objectives 

D. Clarifying the objectives 


These four steps will be explained with each of the four examples. 
Let us, without much ado, proceed with each step in detail. 


FIRST EXAMPLE: 
WOMEN AND EMPLOYMENT 


IDENTIFYING THE RESEARCH PROBLEM AREA 


We have to start with a general area of interest to us, i.e., the
major topic on which we wish to research. We can begin with
the identification of a number of areas of interest. These are:


1. The housing problem in a metropolitan city 
2. Family planning and government policy 

3. Marriage practices in different communities
4. Women and employment 


Reflecting on our general reading and conversations with different
persons, let us further assume that we have also thought out and
briefly considered a number of possible aspects of each of these
problems and then taken a decision to concentrate on the problem
area of ‘women and employment’. When asked the reason for
selecting this problem area, we can offer one or more of the
following as having influenced our interest in the field selected
for study:


(a) A general curiosity about the problem 
(b) The time within which the study must be completed

(c) The emphasis on the learning processes 

(d) The problem lends itself to clear demarcation and selection 
of some specific aspects 

(e) It is timely

(f) Relates to a practical problem 

(g) Relates to a wide population 

(h) Relates to an influential or critical population 

(i) Fills a research gap 

(j) Permits generalisation to broader principles of social interac- 
tion or general theory 

(k) Sharpens the definition of an important concept or relationship 

(l) Has many implications for a wide range of practical problems

(m) May create or improve instruments for observing and analys- 
ing data 

(n) Provides opportunity for fruitful exploration with known
techniques.


So the research problem we have identified for study, is the 
problem of ‘women and employment’. 


SELECTING THE SPECIFIC RESEARCH ISSUE 


In the light of our reflections, talking to others, reading and other 
sources, we have now identified the following aspects of the 
problem of women and employment as worthy of further
consideration:


1. Extent and incidence of women’s employment in terms of 
who works and who does not work, what work women do,
and where they work 

2. The reasons why women work or do not work 

3. Views of different segments of society regarding women
taking up employment 

4. Consequences of women’s employment. 
Each of these is then further scrutinised to list its sub-aspects. For
example, the aspect of consequences could be further sub-clas- 
sified into the economic, social, psychological, family, health, and 
so on. It should be borne in mind that each of these sub-aspects 
could be further divided into components. For example, the sub- 
aspect of consequences for the family could be further broken 
down to provide for the study of one or more of the following 
components: 
(a) Role changes in the family resulting from women’s employ- 
ment 
(b) Study of effects on upbringing of working mother’s children 
(c) Study of problems of working women with particular reference 
to their family obligations. 


The purpose in ‘delimiting’ the problem is to facilitate undertaking 
a manageable study and to develop a clear insight into the 
component. Let us now make the further assumption that after 
reviewing the various aspects and sub-aspects of the problem of 
Women and Employment, one that we have finally selected for a 
detailed study is the prevailing views on women taking up work. 


FORMULATING THE OBJECTIVES 


You will agree that the sub-aspect that we have selected does not 
tell us what exactly is being studied. So we have to be more 
specific in stating our focus of study. To highlight this focus, we 
have to enumerate the interest in the form of specific objectives 
of the study. Thus, we can say that the OBJECTIVES of the study 
are: 


1. To estimate the proportion of the population which is of the 
view that women should work, and 

2. To determine the relationship between select characteristics 
of respondents and their views on the issue. 


Reading the above closely, we find that the statement of the 
objectives is quite precise. Thus, we want to find out: 
(a) How many are of the view that WOMEN SHOULD WORK and
how many are against this view? 

(b) What ‘kind’ of people support the viewpoint that women 
should work and what ‘kind’ take the opposite view? 
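

To see what answering these two questions might involve once the data are in hand, consider the sketch below. It is only an illustration under stated assumptions: the eight responses are invented, sex is used as the one ‘select characteristic’ purely for convenience, and the function names are hypothetical.

```python
from collections import Counter

# Invented responses for illustration only: (sex of respondent, view expressed).
responses = [
    ("male", "should work"), ("male", "should not work"),
    ("female", "should work"), ("female", "should work"),
    ("male", "should work"), ("female", "should not work"),
    ("male", "should not work"), ("female", "should work"),
]


def proportion_in_favour(rows):
    """Question (a): share of all respondents who say women should work."""
    favour = sum(1 for _, view in rows if view == "should work")
    return favour / len(rows)


def crosstab(rows):
    """Question (b): count of respondents for each (characteristic, view) pair."""
    return Counter(rows)


overall = proportion_in_favour(responses)
table = crosstab(responses)

print(f"In favour overall: {overall:.3f}")
for (sex, view), n in sorted(table.items()):
    print(f"{sex:>6} | {view:<15} | {n}")
```

The first function answers ‘how many’, the second lays out ‘what kind’ of respondent holds each view. Any other characteristic could be substituted for sex in the same way.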


Let me make an explanatory comment here. 


Such a precise statement clearly implies that the study will not 
go into other issues or aspects of the major problem. In other 
words, it is saying ‘this much and no more and no less’. The study 
will get answers to two questions only: first, how many are for and 
how many are against. Second, are there differences in the charac- 
teristics of those for, and those against women’s employment? 


You will immediately note that the second objective mentions the 
term ‘select characteristics’ and the ‘explanatory comment’ uses 
the word ‘kind’. Are these referring to the same ‘thing’ or are 
these regarding two different ‘things’? Actually, both refer to the 
same ‘thing’. What is this thing, we shall see shortly. 


But before that, one major doubt that may enter your mind should 
be cleared. The major doubt could be: Are the objectives all that 
specific, precise and unambiguous? The temptation would be to 
say ‘Yes, the objectives are almost self-explanatory’, but in fact 
the phrase ‘should women work’ in the first objective, and the 
term ‘select characteristics’ in the second objective seem quite
vague. We could and should ask the question: what do we really 
mean when we say ‘should women work?’ Who are the women 
we are referring to? and what are the select characteristics that 
we have in mind? 


Yes, there are quite a few words or terms or concepts that are 
unspecified, and have to be clarified. Hence, the last of the four 
steps in problem formulation, is to ‘tie up the loose ends’.


CLARIFYING THE OBJECTIVES 


The final step in this first stage of a research study is to clarify
the key terms or concepts and, more specifically, to identify, select,
and list the specific information that has to be obtained in order
to fulfil the objectives of the study. This information is referred
to as variables.


Variables 


A variable is any quantity or characteristic which may possess 
numerical values or categories. Variables which have definitive 
quantitative values (e.g., the values with reference to age in years
are 1, 2, 3, 4, 5 ... 34, 35, 36, 37, 38 ... 88, 89, and so on) and
can be manipulated according to the rules of mathematics, are
called quantitative variables. Variables which do not have numerical
values but can be graded (e.g., good, better, best; poor, average
and good, and so on) are called ordinal variables. Those which
do not conform to either of these are qualitative variables or 
attributes. 
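

The three-way classification just described can be written down as a small decision rule. The sketch below is one possible rendering, not a standard library facility: the names `VariableType` and `classify` are hypothetical, and the decision to treat sex as a qualitative variable is an application of the rule above rather than something stated at this point in the text.

```python
from enum import Enum


class VariableType(Enum):
    QUANTITATIVE = "quantitative"
    ORDINAL = "ordinal"
    QUALITATIVE = "qualitative"


def classify(has_numeric_values, can_be_graded):
    """Apply the rule in the text: numeric first, then gradable, else attribute."""
    if has_numeric_values:
        return VariableType.QUANTITATIVE
    if can_be_graded:
        return VariableType.ORDINAL
    return VariableType.QUALITATIVE


# Illustrative variables; the second and third labels are assumptions.
examples = {
    "age in years": classify(has_numeric_values=True, can_be_graded=False),
    "rating (poor, average, good)": classify(False, True),
    "sex (male/female)": classify(False, False),
}

for name, kind in examples.items():
    print(f"{name}: {kind.value}")
```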


While on the subject of defining variables, it would also be 
necessary to introduce one more dimension of the variable. We 
often need to differentiate between independent and dependent 
variables. What are these? The two terms, independent and de- 
pendent, give us a very important clue to their meaning and 
purpose. The term ‘independent’ is quite self-explanatory. It can
and will take on values independent of what the values are of any
other variable. Thus, sex is an independent variable because its
values (male/female) are not dependent on any other variable, say,
age or education or occupation or height or weight.


To put it more concretely, the sex of a person cannot be determined 
by, or be dependent on, or influenced by, or be the result of, or 
an effect of any other characteristics of that person. Hence, we 
say that SEX is an independent variable. Similarly, AGE is an
independent variable. 


The reverse of the above then would be the dependent variable. 
Its values will depend upon the value that the independent variable 
ascribes to it. That is to say, the value that the dependent variable 
takes is dependent on the value of the independent variable. To
take an example, to say that males will work and females will
not be employed; that women look after domestic duties but men
will not, amounts to saying that the condition of working is
dependent on the sex of the person. Similarly, we can say that
men will earn more than women. So the sex of the persons
influences their earnings.


There are however two points that you need to know about the 
variables. No variable can be considered independent or dependent 
in absolute or permanent terms. Whether a variable is to be 
considered an independent or a dependent variable will depend 
on how you are looking at the phenomenon. The status of a 
variable as independent or dependent will be decided in the light 
of your conceptual framework. Take sex for example. We have 
just seen how it has been treated as an independent variable. 
Compare that with the following observation that ‘the larger the 
size of the community, the greater will be the proportion of males 
and the lower the proportion of females’. This statement holds 
that the sex distribution of a community is dependent on the size 
of that community (the independent variable). That means at a 
point of time and place, you are able to argue out the rationale 
for making a ‘traditionally’ independent variable a dependent 
variable. You may also recall another example that was given at 
the beginning. It was to the effect that either delinquency status 
influences religiousness or that the latter influences the former. 


So armed with this knowledge, let us get back to our example.
First, let us get the variables identified, selected and 
operationalised. Then we shall allot them to the independent/de- 
pendent category. 


Let us now find the probable variables and items of information. 
After using the procedure of identification and selection, we have 
arrived at the following list: 


1. Marital status
2. History of working women in the family
3. Rural/Urban background
4. Occupation
5. Sex
6. Education
7. Income
8. Family Size
9. Age
10. Family Composition
11. Should women work


Having identified the major variables in which we could be
interested in this study, we can review each of these and select 
just a few ‘important’ ones. The following variables are now 
selected for the study: 


1. Marital Status
2. Occupation
3. Sex
4. Education
5. Income
6. Age
7. Should women work.


The dependent variable for this study is quite obvious. There is 
only one and that is the response to the question: Should women 
work? The independent variables are the ones listed as 1 to 6 
above. 


Defining Variables/Items of Information 


Having listed the variables that are to be used in the study, the 
next task is the clarification of the various variables. This is 
necessary because others who may want to repeat this study or 
those who wish to use the findings would like to know what we 
mean when we use certain words. The words may be vague to 
some readers or likely to have more than one meaning for others. 
These variations may lead to different interpretations. It is thus 
seen that the measurement of the phenomenon will finally depend 
on how these concepts are understood. 
What do we really mean by clarification here? First, it means we
have to define the variable if it is likely to have more than one 
meaning, or there is some confusion or differences in opinions as 
to what one would mean by the term. Secondly, sometimes it 
would be helpful to state what would be the different values that 
the variable will take. So let us take up the task of clarifying the
various variables. But before we go into that, it would be useful 
to mention a very important point. The clarifying operational 
definitions that follow are not necessarily accepted universally, 
but could vary from one study to another depending on the focus 
of the study. But as a researcher, you must be able to justify 
whatever operational definition you use. 


Women: This term is taken to refer to the female population and 
specifically to those between 14 and 55 years of age, irrespective 
of their means of livelihood. 


Employment: This term will refer to work which is economically 
beneficial to the individual concerned. Employment as defined in 
this manner could take one or more of these forms (i.e., the 
different values): 


Full-time employment: Fruitful employment which makes it necessary
for the individual concerned to work for the normal number
of hours per day as determined by the organisation in which she
is employed.

Part-time employment: Employment which requires the individual
to work in an organisation for less than three-fourths of the normal
number of working hours stipulated by the organisation.


Self-employed: An individual who is gainfully employed by doing 
work without employing others or being employed by others. 


Employee: One who does work under others for wages or salary 
in cash or kind, or both. 


Age: Refers to the number of years completed on the last birthday. 
The term ‘old’ refers to all those 55 years of age or over. ‘Middle’ 
refers to those 35 to 54 years of age, and ‘young’ to those 34
years of age or less. 


Education: This refers to a formal training in an educational 
institution. The three classifications (or values) of education are
low (up to S.S.C), middle (Inter/Graduate) and high (Post 
graduate/Professional degree). 


Occupation: This refers to the means of livelihood. The various 
occupations have been classified into four major categories (i.e.,
values) reflecting their relative positions in the generally accepted 
hierarchy: 


(i) Not Working 
(ii) Workers 
(iii) Clerical staff 
(iv) Executive (managerial, supervisory, and professional) 


Income: This is the total monthly income of the family. In the 
case of respondents who are temporarily living away from their 
families, the income will include both the respondent’s earnings
as well as the earnings of all family members.
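

An operational definition is precise enough when someone else can apply it mechanically and arrive at the same classification. The sketch below tests three of the definitions above in that spirit. The function names are hypothetical, the reading of ‘between 14 and 55’ as including both ends is an assumption, and the label ‘unclassified’ is introduced here only to mark a case that the definitions as worded leave open.

```python
def age_group(age_years):
    """Age in completed years: young (34 or less), middle (35 to 54), old (55 or over)."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years >= 55:
        return "old"
    if age_years >= 35:
        return "middle"
    return "young"


def is_woman_in_scope(sex, age_years):
    """'Women' as defined for this study; treating 14 and 55 as included is an assumption."""
    return sex == "female" and 14 <= age_years <= 55


def employment_form(hours_worked, normal_hours):
    """Full-time: the normal hours; part-time: less than three-fourths of them."""
    if hours_worked >= normal_hours:
        return "full-time"
    if hours_worked < 0.75 * normal_hours:
        return "part-time"
    # The definitions do not say how to label this band; flagged, not guessed.
    return "unclassified"


print(age_group(34), age_group(35), age_group(55))
print(is_woman_in_scope("female", 20), is_woman_in_scope("female", 60))
print(employment_form(8, 8), employment_form(5, 8), employment_form(7, 8))
```

Writing the definitions out in this way quickly exposes loose ends, such as the person who works seven hours where eight is the norm. As noted earlier, you must be able to justify whatever rule you finally adopt for such cases.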


Anticipating Outcome 


If you wish, you can take an optional sub-step. This is to tenta- 
tively state the possible association/relationship between the 
independent and the dependent variables. These are your 
hypotheses. You may have the feeling that ‘research’ means
making and testing of hypotheses. No doubt this is true of
some particular kinds of research. But some research may not
begin with a hypothesis, yet it cannot be labeled as
‘non-research’. Hypotheses may emerge during the course of the
study or even at the end of it, by way of suggesting possible
associations.


If you decide to formulate hypotheses as part of the step in 
problem clarification, make sure that these are accompanied by 
appropriate arguments as to the reason for your anticipating the 
results implied in the hypotheses. In other words, you must state 
why a hypothesis is expected to be confirmed or rejected by the
study. For this purpose of both hypothesis formulation and its 
justification, you have to do an intensive study of available 
literature. It may also be noted that it is not always necessary to 
state the expected relationship in an affirmative form, e.g., ‘More 
men than women will report that women should not work’. One 
could also state a null hypothesis, e.g., ‘There will be no differen- 
ces between men and women in their views on whether or not 
women should work’. 


The above illustration should give you a reasonably good idea of 
how the first major stage in a research project — problem for- 
mulation — is done. It must however be noted that not all studies 
go through each and every one of the above steps that have been 
described in the foregoing example. As I had already mentioned 
in the PREFACE, the question of whether or not one should go 
through a particular step in a research project, will depend on the 
answers to two further questions: How much is already known about
the phenomenon which is intended to be studied? And how specific
is one’s choice right at the outset?


You will notice that the fourth step — problem clarification — is 
the ‘longest’ of the four steps. But the bulk of the problem 
formulation stage’s time and effort has to be spent on the third 
and fourth steps. 


SECOND EXAMPLE: 
HOUSING SITUATION 


IDENTIFYING THE RESEARCH PROBLEM AREA 


Let us say your organisation is interested in the issue relating to 
housing in a metropolis. To this end you want to undertake some 
research work and invite suggestions from various sources. The 
topics that are recommended include the following: 


1. The slums in the metropolis: some facts and related solutions
2. Housing differentials of low income groups in a city
3. Rents and subsidies for low income groups
4. The Rent Act and housing production
5. Housing situation in a metropolis
6. Monetary and fiscal policies and investments in housing
7. People’s participation in housing projects


SELECTING THE SPECIFIC RESEARCH ISSUE 


After considering each of the above, you select topic No. 5 — 
Housing situation in a metropolis. You then proceed to the third 
step in this first stage of problem formulation. 


FORMULATING THE OBJECTIVES 


Starting from this ‘specific’ focus of the study, you develop two 
objectives: 


1. To find out:


(a) The number and types of dwellings in the spectrum from
hut to bungalow

(b) The relative facilities and amenities, existing and
preferred, and

(c) Characteristics differentiating households on these
components.


2. To ascertain:


(a) The current rent paid by tenants for their dwellings and
their capacities to pay rent as measured by realistic
housing needs, identified by the households themselves

(b) Characteristics differentiating households on these
components.


CLARIFYING THE OBJECTIVES 


Now the task is to take each objective and list the variables and 
related items of information. We do this, objective by objective. 
To do so, let us first write down the key terms in the objectives
and against each, list the variables/items of information that we 
will need to obtain to fulfil the objectives of the study. 


1. Existing Dwellings

a. Type of dwelling
b. Number of dwellings
c. Number of rooms
d. Facilities
e. Tenancy status
f. Rent paid per month


2. Preferred Dwelling

a. Type
b. Number of rooms
c. Facilities
d. Tenancy status
e. Rent payable


3. Characteristics of Household

a. Socio-economic status
b. Household size
c. Place of origin
d. Period of stay in metropolis.


Let us then define the key terms in the list above. You will notice
that we have a variable named ‘socio-economic status’ (SES).
This is actually a composite variable. A composite variable is one
which is obtained by adding together the values of a number of
related variables. In this case, socio-economic status, a composite
variable, is arrived at by adding up the values given to education,
occupation and income of individuals/families.
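
The idea of a composite variable can be illustrated with a short sketch. The text says only that the values of education, occupation and income are added up; it does not give the numeric values. The scores below are therefore assumptions chosen for illustration, as are the three income bands and the name `ses_score`. The education and occupation categories are those defined in the first example.

```python
# Hypothetical scores: the text names the categories but assigns no numbers.
EDUCATION_SCORE = {"low": 1, "middle": 2, "high": 3}
OCCUPATION_SCORE = {
    "not working": 1,
    "workers": 2,
    "clerical staff": 3,
    "executive": 4,
}
# The income bands are an assumption; the text defines income but not its bands.
INCOME_SCORE = {"low": 1, "middle": 2, "high": 3}


def ses_score(education: str, occupation: str, income_band: str) -> int:
    """Return the composite SES value as the sum of the three component scores."""
    return (
        EDUCATION_SCORE[education]
        + OCCUPATION_SCORE[occupation]
        + INCOME_SCORE[income_band]
    )
```

With these assumed scores, a respondent with low education, not working and in the low income band gets the minimum value of 3, while a highly educated executive in the high income band gets the maximum of 10.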


THIRD EXAMPLE: 
MEASURING CONSCIOUSNESS 


Now to make sure that we have a full grasp of the first stage of
problem formulation, let us take one more example. However, in
this case, we assume that you have just enough experience in the
field to be able to undertake the first two steps on your own. So
I can move directly to the third step. In fact, as you gain experience
(knowledge and skills), you will increasingly find that you
consciously start with the third step. However, it would be useful
to remember that when you are stuck in the third step of problem
formulation you would do well to ‘take a step back, into step
two’ to gain greater understanding of what objectives you should
formulate.


FORMULATING THE OBJECTIVES


The only objective of this study in a rural community is:


To determine the level of consciousness achieved by its adult 
education participants. 


CLARIFYING THE OBJECTIVES 


At the outset, it is necessary to first clarify the major concept 
‘consciousness’ as it is used in adult education. Discussions with
‘Non-Government Organisations’ (NGO) officials and reading of 
appropriate literature indicate that in its initial stages, the 
programme has the following five dimensions: 


1. Social Awareness 2. Social Functionality 
3. Analytical Skills 4. Awakening Consciousness 
5. Perception of ‘Major Actors’ 


Obviously, each of the above is in itself a very complex sub-objective.
So, it is most vital that these be further analysed to
arrive at the relevant variables and related items of information. 
Hence, once again, you have to read up the relevant documents, 
discuss with knowledgeable persons and develop the list of data 
required to fulfil the study. Let us now assume that this task has 
been undertaken and the following emerge from the consultations 
with individuals and documents. 


1. Social Awareness 


(a) Knowledge about select legislations 
(i) untouchability 
(ii) land ceiling.
(b) Action-Intention with respect to these legislations 
(i) untouchability in the village 
(ii) excess landholdings in the village.


2. Social Functionality


Knowledge and use of 
(i) Savings account 
(ii) Loans for housing 
(iii) Loans for employment 
(iv) Loans for agricultural operations 
(v) Free fertilisers and seeds 
(vi) Free medical services 
(vii) Veterinary services 
(viii) Scholarships, free hostels, free education 
(ix) Ration card. 


3. Analytical Skills (with respect to poverty)


(a) Causes of poverty 

(b) Party responsible for causing poverty 
(c) Can poverty be solved? 

(d) Sources of help in solving poverty 


4. Awakening Consciousness (through the village problem)


(a) The village problem 
(i) The most serious problem 
(ii) Reason problem considered most serious 
(iii) Causes of the problem 
(iv) Party responsible for the problem 
(v) Can problem be solved?


(b) Villagers’ participation in problem solving 
(i) Efforts made to solve problem 
(ii) Outcome of efforts made to solve problem 
(iii) Whether villagers normally get together to solve vil- 
lage problem? 
(iv) Will villagers get together to solve village problem? 


(c) Role of self in problem solving 
(i) Can the respondent solve the problem? 
(ii) Will respondent participate with others in solving the
problem? 


(d) Other instruments for solving problem 
(i) Luck 
(ii) Prayer 
(iii) Dissemination of information 
(iv) Education 
(v) Changing people’s attitudes
(vi) Organising people 


(e) Perception of major actors 
(i) Oppressor 
(ii) Oppressed 
(iii) Self-image
(iv) Pride in belonging to the oppressed class 


FOURTH EXAMPLE: 
DIRECTORS OF COMMUNITY PROJECTS 
AND HUMAN DEVELOPMENT 


The fourth example from the field of community organisation that 
has been selected is a complex one. But for purposes of 
demonstrating the problem formulation stage, it has been presented 
in as simple a form as possible. Again, here, we will skip the first 
two steps of problem identification and selection. In fact, both 
these would be common to the first example taken from the 
community organisation field. So we can move right into the third 
step which is the formulation of objectives. 


FORMULATING THE OBJECTIVES 


Objective One: To describe the select characteristics of the
sample of Directors of Community Projects 
(DCPs). 


Objective Two: To ascertain DCPs’ understanding of human
development.


Objective Three: To determine the relationship between select
characteristics of DCPs and their understanding 
of Human Development. 


CLARIFYING THE OBJECTIVES 


Objective One: Review of materials would reveal that the relevant 
characteristics to be studied are the respondent’s:


1. Sex
2. Age
3. Time spent doing social work: full-time/part-time/occasionally
4. Total number of years in social work
5. Training for social work: formal/informal/nil
6. Participation in community work.


Objective Two: Reading through relevant literature including a 
recently concluded evaluation study titled Towards Integrated 
Human Development, and discussing the subject with various 
authorities, we have a predetermined list of statements that would 
help to differentiate shades of views on alternate goals and means 
of human development. So the list of statements is given below
without further comment.


1. To ensure that all persons have a secure and adequate
livelihood
2. To rehabilitate the handicapped/destitute
3. To promote a healthy environment
4. To promote formal education
5. To organise vocational training for those who desire it
6. To organise people to determine for themselves in a collective
manner their own growth and the growth of the community
7. To organise relief and rehabilitation works
8. To gain an entry into a community for development work
9. To gain the confidence of the people
10. To give the impression that the organisation is doing
something for the poor
11. To enable people to discover the resources within themselves
12. To heighten the awareness and critical ability of people
13. To provide financial and material aid to the people
14. To enable people to achieve economic well-being
15. To have respect for the human person
16. To love everyone in need irrespective of caste, creed and
status
17. To give material help to the poor
18. To use one’s skill for the betterment of the poor
19. To take part in rallies, demonstrations, protests in favour of
the poor
20. To bring together people to actualise problems and become
aware of their resources and capabilities


Objective Three: A quick reading of this objective shows that no 
new variables enter the picture. All or some of the variables 
relating to objectives one and two will be utilised in fulfilling this 
objective. 


CHAPTER 2 


RESEARCH 
STRATEGY 


We have followed in the first chapter the process through which 
problem formulation — problem identification to clarification — 
takes place. At the end of the process, we had clear statements 
of the objectives of a study as well as detailed listing of the 
variables and related items of information to fulfil the objectives 
of the study. In sum: you now have a clear idea of WHAT you 
want to study through a research project. 


Having decided the WHAT of the study, the next stage is to plan
the HOW of the study. That is to say, you have to decide what
would be the most appropriate generic research strategy to
adopt in order to execute the study and find the answers to the
questions posed by its objectives. This can be done most
expeditiously and efficiently by taking the first step in this stage:
defining the parameters or scope of the project.


The satisfactory completion of this task will in turn provide vital
indicators for working out the second step, i.e., identifying the
generic research strategy. 


The identification of the generic research strategy will help spell it 
out in greater detail by amplifying its specific research components.


SCOPE OF STUDY


The function of this step is to specify the boundaries of the study. 
The specification provides a vital link between problem formula- 
tion and determination of the research strategy, both generic and 
its specifics, of the study. The scope of a study is best elaborated 
through its four elements: 


A. Time or reference period of the study 

B. Place or location of the study 

C. Population or universe to which the study applies 

D. Source(s) from which the relevant information is to be ob- 
tained. 


We shall now consider each of these in some detail. 


TIME 


This could be the distant past, the immediate past, the present, 
the immediate future and/or the distant future. Assume that you 
are interested in studying something that happened many years 
ago. Clearly you cannot literally and personally go into the distant 
past to collect information. There may be no living person born
before a particular year who could furnish the information we are
looking for. Even if there is someone who was born in the year
we are interested in, she may have no knowledge of what happened 
at that time. She may have been too young, or indifferent or just 
not ‘around the place’ when the event occurred. 


So we have to depend on some other means to tap the required 
information. We may have to refer to whatever is currently 
available in different forms, e.g., art, oral tradition, written 
material etc. So we are essentially talking of historical studies. 


The same problems may not arise if you are studying the same 
phenomenon in the present. There may be a number of sources 
from which you can collect your information. You may even be 
able to check and cross-check the information that you have 
collected from different sources on the same issue or phenomenon.


If you intend to study the same phenomenon as it would occur
in the future, you are in a different situation. For the future has 
not yet happened. Even if you wanted, you will not get anyone 
who has seen and/or experienced that phenomenon in its future 
form. Yet you want to somehow get to know something about it. 
And there is nothing to prevent you from speculating what it could 
be in the future. 


You will notice that even as we refer to the time dimension, we 
are also referring to the source of data dimension. There is a link 
between time and source. Similarly, there is a link between time 
and place as well as between time and population. In the final 
analysis, as the time dimension varies, the generic research 
strategy also varies. 


PLACE 


The place in which, or about which, the study is to be conducted 
and the population of the place, also influence the decision on 
how to do the study. First, let us consider the place itself. This 
could be a local community, a nearby village or town or city, 
or a place that is at a distance. Put in another way, it could be 
an urban metropolitan city, a non-metropolitan city, a town, a 
village or a tribal area. It could be located in India, or in another 
country or continent. Within the stipulated place, it could be 
an institution or a group of persons who form the subject matter
of study. 


The further the place, the more difficult it becomes to tap the 
information you need. You may not be able to read or to speak 
the language. You may not be able to reach the place or know 
anyone else who lives there or can go there for you. Finally, 
it may be too expensive to get to that place. All these assume
that the place is ‘now and here’. What if the place was ‘in the 
past but no longer now’, say an ancient city like Harappa or 
Atlanta?


Then, we have to consider the population on which the study will
focus. This could include all or some of those residing in the 
chosen place. It could refer to those who have predetermined 
characteristics like female or male, or youth or child, working, 
handicapped, executive, educated or illiterate, marginalised and 
so on. In other words, it would be useful to know some essential 
characteristics of the population that is being studied. 


For example, the method that you would use to get the relevant 
information may be influenced by whether the target population 
consists primarily of illiterate persons or highly educated people. 
Again, there may be restrictions on who can talk to whom in 
highly restrictive cultural settings which do not encourage, or even 
prohibit, conversations between the sexes (e.g., a mother-in-law
may not permit her daughter-in-law to be interviewed when the 
former is the ‘functional’ head of the household). 


Then again, if you do not know the local language, you cannot 
talk to the people there. Unless of course, they know your 
language, or you can find someone else to do the talking for you 
and then translate to you. But notice that direct, personal, face-to-face
talking is not the same as talking through someone else. You
are already introducing some variations in your study. This may 
influence the results as well. 


The place and population dimensions are interrelated in most 
studies. You cannot study a population in a non-existent place, or 
the place minus the population, in social science research. So we 
take them together. 


Broadly speaking, as the place of the study and the population to 
be studied vary, the research strategy to be adopted to execute 
the project would also vary.


SOURCE OF DATA


This refers to the source or sources to be tapped for relevant
information that we have decided we need to fulfil the objectives 
of the study. This is the single most important dimension that 
makes up the scope of a study. No matter what other dimension 
we talk about, this dimension creeps into the discussion and almost 
dominates the whole discussion. In fact, if we can correctly decide 
the optimal source or sources from which we can get our infor- 
mation, we have, in effect, almost pinpointed the research strategy. 
This, being so, we shall explore it in some detail. 


Sometimes, the same item of information can be obtained from
two or more sources. Either all the probable sources are tapped
and data collected from each, or only one of them
is tapped. The decision regarding this would depend on:


1. The relative reliability of each source
2. The quantum of information to be tapped from the source.
The more the number of items to be tapped from one and the
same source, the greater will be the preference for that source
3. Accessibility to the source and the efforts required to tap the
source.


Of the three factors, the third is most crucial. Hence, it has been 
illustrated with an example. If the objective of the study of ‘should 
women work’ had been to find out whether there has been, from 
a historical viewpoint, a change in views on the issue, the source
of data would necessarily have to be books and documents to
ascertain what was said in the years gone by. If these are not
available, or nothing has been written, then you cannot pursue
such a study. 


However, since one of the objectives of the study as reported in
Chapter 1 is to ascertain AS OF NOW what the views of people
in a given community are on the question of whether or not women
should work, obviously the appropriate source of information
is the people themselves. There is no other source from which
this information can be obtained.


If the study was intended to find out ‘how many women work in 
India, in what industries and in what occupation, and do working 
women differ from non-workers in respect of age?’, then one 
would invariably look for the relevant information in the Census 
report for the community. It would even be possible to ascertain 
whether the pattern of employment has changed over the last 
hundred years. 


Therefore, a crucial issue in selecting the appropriate strategy for
research is to ascertain the sources of data in the context of
the objectives of the study. The various sources of data can usually
be classified in several different ways. But for our purpose, it 
would suffice to classify these as Primary and Secondary. 


Primary data refers to direct, first hand observation of the 
phenomenon or collection of information directly from the ‘horse’s 
mouth’. In other words, the sources are the persons who are 
personally contacted for the relevant information, and/or asked to 
record their views in a pro forma. We shall talk about these means 
later in the next chapter. So, it suffices to say at this point, that 
it can also include first hand personal observation of events, 
actions, etc. 


Information is collected from primary sources mainly because the
nature of data we are seeking will be available only from those
who have this kind of information. For example, opinions, attitudes,
knowledge, approach and perceptions of phenomena can
be tapped only from those who have the information (the individuals,
in this case). Incidentally, the primary sources of data
usually indicate that the best means or strategy to tap the source
would be through what are called field studies.


Usually, the Secondary sources are also referred to as documentary 
sources. Secondary data are details derived directly or indirectly 
from the primary data and include the following:


1. Books and journals

2. Reports of research conducted in earlier times 

3. Official reports and statistics, reports of committees and com- 
missions 

4. Records of institutions, etc. 


The immediate concern is with sources which can provide the 
necessary material for undertaking a study. Sources that yield
considerable factual information about a community are the decennial
census reports and district gazetteers. They give valuable and
voluminous information about all communities in a country. Basic
demographic, social and economic data are given, e.g., age, sex,
education, occupation, unemployment, marital status, language,
mother tongue, housing etc. Special reports of the Census operation
provide additional data. These are most useful for basic
studies on socio-economic levels, and comparison of data yields 
valuable insights into many problems. 


Research reports also give data which can be reanalysed. Similarly, 
other sample surveys also yield valuable data for re-analysis and 
for undertaking studies on specific issues. In the same category
are annual reports of institutions. 


Documents may also be unpublished, e.g., the administration
records of hospitals, institutions for children and for women, and
educational institutions. Social welfare agencies will have information
on their beneficiaries. Records of individuals maintained by
welfare agencies can be used.


A point to be noted about these records is that some of these may 
be defective. For example, the information these records contain 
may be adequate and reliable for some purpose, but inadequate 
and unreliable for another.


Again, the data may not be representative. ‘Representative’ refers 
to whether or not the findings based on this source permit drawing 
inferences for the universe from which they were drawn.


GENERIC RESEARCH STRATEGY


From a careful reading of Section I on ‘SCOPE OF STUDY’, you
will have noticed that each element influences and in turn is 
influenced by the other elements. But underlying these are the 
objectives of your study. 


Let us consider a few examples which would bring out quite
clearly the importance of research objectives in determining the
generic research strategy and, in this context, the crucial role played
by elements of the SCOPE:


Objective 1 : To trace the history of Organisation X from its 
inception in 1950 to 1975 

Objective 2 : To list the spectrum of probable reasons for drop- 
ping out of adult education programme 

Objective 3 : To measure the extent of drop-outs from the adult 
education programme 

Objective 4 : To estimate the level of consciousness of members
of the community in which the adult education 
programme is being conducted 

Objective 5 : To compare the relative effectiveness of two types
of approaches to reduce the drop-out rate in the 
adult education programme 

Objective 6 : To describe the process through which an adult 
learner undergoes change as a result of becoming
literate. 


Now, read each of these carefully. Do not try to ‘read more than
there is’ into the statements. We shall not, at this stage, go through
the detailed process of clarifying the objectives in terms of
variables and items of information. Right now, we are concentrating
on determining the research strategy from the objectives of
the study, and to this end, we have to specify the scope of the
study. So, let us tackle each of the objectives.


Objective 1: To trace the history of Organisation X from its inception
in 1950 to 1975. 


We are now in the nineties. So the time period of the study is 
the ‘past’, the place is the one where Organisation X is situated, 
and secondary sources would be the most obvious (records and 
reports, minutes of meetings of its various bodies and so on).


Given the above indicators, the optimal approach (Generic 
Strategy) for fulfilling the objectives of the study would be to 
undertake a library or desk research or use available data and do 
a historical research project. The delimiters (scope of study elements)
exclude the collection of original data, or setting the study
in the ‘present’ or ‘future’, and also the possibility of a field or 
empirical study. 


Objective 2: To list the spectrum of probable reasons for dropping 
out of adult education programmes. 


The key operative word which provides a vital clue to the appropriate
generic research strategy is ‘LIST’. Operationally, it calls
for the preparation of a list of probable reasons for adults dropping
out of the adult education programme. It does not seek to find
out ‘how many give which reason’, or ‘who gives which reason’.
It is not even concerned with the question of whether reasons
have changed over time. It limits the assignment to listing
the probable reasons for dropping out of the programme. So
you see, the strategy you adopt is a very flexible one. It does not
matter what your time period is, what the place is or even what
the source of information is. The only definitive element is the
population: adults who have ceased to participate
in an adult education programme in any time period, and
in any place. The objective does not stipulate that it is the adult
participants themselves who will be the source of information. The only
requirements are that the time period starts from the time adult education
programmes were introduced, and that the information is about the
participants.


Objective 3: To measure the extent of drop-out from the adult
education programme. 


This objective calls for an accurate count of the number of
drop-outs as against the number of adults who registered to
participate in the adult education programme. You will notice that
the objective does not provide for collecting information on the
reasons for dropping out from the programme, who have dropped
out, when they dropped out, and so on. It just wants to find out
what is the proportion of drop-outs.
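
The measure this objective asks for is simple arithmetic: the number of drop-outs divided by the number of adults who registered. The following is a minimal sketch; the function name `drop_out_proportion` and the figures in the usage note are illustrative assumptions, not data from any study.

```python
def drop_out_proportion(registered: int, dropped_out: int) -> float:
    """Return drop-outs as a proportion of the adults who registered."""
    if registered <= 0:
        raise ValueError("at least one adult must have registered")
    if not 0 <= dropped_out <= registered:
        raise ValueError("drop-outs must lie between 0 and the number registered")
    return dropped_out / registered
```

For instance, if 40 adults had registered and 10 of them had dropped out, `drop_out_proportion(40, 10)` gives 0.25, i.e., one in four.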


We are now coming across this non-specification for a second 
time in our two examples. So the question you may have in mind 
could be this: Merely because the objectives did not stipulate the
time and place and other specifics, does it mean that one is not
allowed to state these at some later stage? This is indeed a very
fundamental question we must answer before proceeding with the
subject of generic strategy.


As has been mentioned earlier in this Primer, the objectives
determine what shall be studied and what should not be studied.
Therefore, it is advisable that all relevant information be included
in the statement of objectives. But then, it can happen that at the
point of time that the objectives are being formulated, you may
forget or overlook these details and it is only later that you realise
the shortcomings or limitations implied by the statement (assuming
the omission was not deliberate). It is necessary that we have
checks and counter-checks so that the purpose of the study is not
left incomplete because of an error of oversight. It is precisely
for this reason that the step ‘scope of study’ becomes critical.


So in the study under consideration, you are quite free to stipulate 
the details with respect to the four elements of time, place, 
population and source of data. So you can now say that you are 
covering the period April 1992 to March 1993. The place is 
Village Y in State Z. The population is the adult population that 
has registered for participation in an adult education programme. 
The source would then be the attendance register of the training
programme. 
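
The four elements of scope stipulated above can be written down as a single record, which makes it easy to check that none has been left out. This is only a sketch: the class name `StudyScope` and the helper `missing_elements` are hypothetical, while the values are those given in the text for this objective.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StudyScope:
    """The four elements that make up the scope of a study."""
    time: str
    place: str
    population: str
    source: str


def missing_elements(scope: StudyScope) -> list:
    """Return the names of scope elements that have been left blank."""
    return [name for name, value in asdict(scope).items() if not value.strip()]


# Values taken from the text for the drop-out study.
drop_out_scope = StudyScope(
    time="April 1992 to March 1993",
    place="Village Y in State Z",
    population="Adults registered for the adult education programme",
    source="Attendance register of the training programme",
)
```

Calling `missing_elements(drop_out_scope)` returns an empty list, confirming that all four elements have been stipulated.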


Having clarified the above, the appropriate strategy to fulfil the 
objectives becomes almost self-evident. It is again secondary 
analysis or use of available data. You may argue that there is an 
alternate strategy and that is to contact all or some of those who 
had registered for the programme and find out from each of them 
whether or not they have dropped out from the programme. If 
they have dropped out, the reason for it can be ascertained. 
Granted that this could be a possible alternative, it is necessary
to consider the pros and cons of each alternative strategy. In this
latter case, you will have to get to the concerned adults (they may
not be available when you reach their place), and they may not be
willing to talk to you or to answer the questions. More important,
consider the money, time and personnel resources that will have
to be invested in the study. As against these considerations, the
use of the attendance register will eliminate most causes for
non-fulfilment of the objectives of the study.


Objective 4: To estimate the level of consciousness of the members 
of the community in which the adult education programme is 
conducted. 


Before we go further, let us delimit the objective by stipulating
its scope. The time dimension can be taken to be the present time;
the place would be the same village Y, but the population would 
be all those in the village or community in which the adult 
education programme is conducted. The question is: Do we cover 
in our study all members of the community or only some? Do we 
include children and the mentally handicapped? What is the 
justification for selecting only some and not all members of the
community? 

So, we may find that the real purpose, which somehow was not
mentioned earlier in the proposal for the study, was to find out
whether those who undergo the adult education programme,
compared to those who do not undergo the programme or drop
out of the programme, have a higher level of consciousness. This
clarification, while the scope of the study is being finalised,
provides a very good opportunity to the researcher to review and
revise his objectives. Thus, even as the study is being planned, it
gives you opportunities to check and recheck your objectives.


Now, we move to the last of the four elements, the source of data. 
It is clearly the adults in the village or community who will be 
the source for the requisite information. We have to ask them 
questions that would help measure their level of consciousness. 


Thus, the source is further specified as being ‘primary’. Primary 
data have invariably to be collected through an empirical study. 
So the strategy is to undertake a field study. And if the number 
of eligible adults is large, we may decide to take some who would 
be ‘representative of the whole’. The part which is selected is the 
sample. So the generic approach is the sample survey.
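

Taking 'some who would be representative of the whole' can be
sketched as follows. Simple random selection is used here only as one
possible procedure; the choice of procedure belongs to the sample
design. The list of eligible adults, the sample size and the function
name `draw_sample` are assumptions made for illustration.

```python
import random


def draw_sample(eligible_adults, sample_size, seed=None):
    """Select sample_size adults at random, without repetition."""
    if sample_size > len(eligible_adults):
        raise ValueError("the sample cannot be larger than the population")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(eligible_adults, sample_size)


# Hypothetical list of 500 eligible adults in the village.
eligible = [f"Adult {number}" for number in range(1, 501)]

sample = draw_sample(eligible, 50, seed=1993)
print(len(sample), "adults selected out of", len(eligible))
```

Every adult on the list has the same chance of being selected, and no
adult can be selected twice.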


Objective 5: To compare the relative effectiveness of two types of 
approaches to reduce the drop-out rate in the adult education 
programme. 


The emphasis in this objective is on comparing two methods of 
intervention to find out which of the two would result in a 
significant reduction in the drop-out rate. The methods are to be 
‘applied’ to two groups of adults, each being subject to one of 
the two methods of intervention, whose drop-out rates will have 
to be compared after they are subjected to the method. So the 
methods become the ‘stimuli’ and the drop-out rate becomes the 
‘response’. Now the generic research strategy that has this type 
of an approach is referred to as the experimental design. 
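

The comparison of the 'response' in the two groups can be sketched as
below. The group sizes and the numbers of drop-outs are invented for
illustration, and the function name `drop_out_rate` is an assumption.
The sketch only computes the two rates and their difference; whether
the difference is significant has to be decided with a statistical
test, which is not shown here.

```python
def drop_out_rate(enrolled, dropped):
    """Proportion of the enrolled adults who dropped out."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    if not 0 <= dropped <= enrolled:
        raise ValueError("dropped must lie between 0 and enrolled")
    return dropped / enrolled


# Hypothetical results after each group has been subjected to one
# of the two methods of intervention (the 'stimuli').
group_a = {"enrolled": 40, "dropped": 10}  # first method
group_b = {"enrolled": 40, "dropped": 16}  # second method

rate_a = drop_out_rate(group_a["enrolled"], group_a["dropped"])
rate_b = drop_out_rate(group_b["enrolled"], group_b["dropped"])
difference = rate_b - rate_a

print(f"Group A: {rate_a:.0%}  Group B: {rate_b:.0%}")
print(f"Difference: {difference:.0%}")
```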


I have not yet mentioned the four elements. The basic experimental 
design gets modified as the characteristic of each of the elements 
changes. But there are limits to this.


Objective 6: To describe the process of change that an adult learner
undergoes as a result of becoming literate. 


This objective is quite different from the other ones. For it calls 
for a description of a process. More specifically, it requires that 
the researcher studies in great detail the discernible changes taking 
place in the life of a person who has now become literate. It 
would be difficult to anticipate the variables and items of
information that would be needed to complete this assignment. It
does not mean that there is nothing to start with. On the other
hand, the researcher will have some ideas as to what she wants
to cover. The major aspects of the subject’s life would be included,
e.g., economic, social, communication aspects and so on. Then the
sub-aspects may also be predetermined, or left to be tapped as
the data are being collected. To illustrate, take one aspect, the social.
The interest would be in knowing about the person’s relationship
within the family, interactions with outsiders, communication
pattern, participation in public activities and so on. And as these
are being explored, the researcher will ‘discover’ more and more
items on which information needs to be obtained, especially
because the primary interest is in the process.


It will be clear from this that the population to which the study 
refers is composed of the adult literate persons. But the objective 
clearly states that the study will be of one person only. Hence the 
need to be purposeful in the selection of this person. Since the
focus of the study is more on the process of change, it would be 
useful, nay necessary, to tap more than one source. Thus, you 
may want to know from her family members and relatives, friends 
and co-workers, relevant information about the person who is 
being studied. In other words, any person who can throw light on
the subject would be a potential source of information. 


A generic research study which calls for multiple sources of data
to be tapped, and in great depth, about one or just a few persons
or institutions, is referred to as a case study.


The six objectives have been presented to introduce you to four
major research strategies. These are secondary analysis, survey, 
experiment and case study. Hereafter, we shall be focussing on 
the survey research only. So we move on to the third step. 


SPECIFIC RESEARCH DESIGNS 


The generic research strategy contains within it four specific
research designs. These are:


A. Methods Design 

B. Sample Design 

C. Analysis Design 

D. Organisational (or Operational) Design 

Since each of these will be discussed in quite some detail in the
subsequent chapters, I shall give only a very brief idea of each
of these here.


METHODS DESIGN 


The methods design helps to answer the question: How best can 
we obtain information from persons who are the primary source 
of our data? When designing the method of data collection, it 
would also be necessary to pay particular attention to the techni- 
ques and related tools for collecting the data. 


SAMPLE DESIGN 


The sample design answers two questions: From how many 
persons do we collect the necessary information? And, how do 
we select these persons from a large number who would otherwise 
be eligible to be selected?


ANALYSIS DESIGN 


The analysis design is necessary for deciding how the data must 
be processed and analysed in order to find the answers to the 
questions raised through the statement of objectives. In sample
surveys, the general tendency is to collect voluminous data, and
these would have to be summarized and examined to find the
specific answers to the specific questions raised by the study. 


ORGANISATIONAL DESIGN 


The organisational design pertains to the administrative and opera- 
tional details of a study. 


Now in ‘survey research’, the method of data collection is usually 
the interview of persons; the sample is usually medium to large 
in size, the method of selecting the sample could be by the use 
of representatives or the probability procedure, the analysis is 
quantitative and statistical, and the organisation required to con- 
duct the study is medium to large scale. 


It may be noted here that since the design step of a study is 
preceded by the problem formulation and the specification of the 
scope of the study, it is necessarily based on the objectives as 
well as the scope that is laid down. That is to say, each of the 
four designs is drawn up so as to fulfil the objectives of the study 
and in conformity with the stipulated scope. Thus, it follows that 
there is no standard research strategy or specific design as such. 
The design will change as the objectives change, and as the data
to be collected and the scope of the study change. Hence, a
design which may be appropriate for one study may prove inap- 
propriate or inadequate for another. 


While the strategy selected would indicate the general framework 
for the study, it is necessary to determine, within this scheme, the 
procedures to be adopted in the actual collection of primary data. 
That is to say, the methods and techniques to be applied in the 
study so as to obtain the relevant data, have to be decided upon. 


While the strategy may be to use a cross-section survey, it has to 
be decided how the information is to be collected from the 
cross-section of persons. On the one hand, it may be possible to 
contact each one of the persons and ask them a series of questions
which have already been predetermined, providing for no variation
either in the number of questions or the order in which these may
be asked. On the other, it may be possible to hand over to each
of the persons selected, a set of questions to be answered at the
leisure of the respondents and returned to the researcher. A third
possibility would be to carry out intensive interviewing with each
one of the respondents, with the intention of covering some areas
of information without actually framing the questions beforehand.


It may again be possible to independently collect information from 
official records about each one of the respondents to the extent
possible, and later complete the remaining queries by meeting
them. Thus, there are different procedures that may be adopted
to meet the requirements, all within the framework of a
cross-section design study.


Such being the case, we may now be in a position to lay down 
the criteria to be fulfilled in the designing of a study. The overall 
design should be such that it: 


1. Provides for the collection of data in a manner to produce a
high degree of accuracy, reliability and validity

2. Involves a minimum amount of bias and subjectivity on the
part of both the researcher and the respondents being contacted

3. Provides the maximum financial economy

4. Is the most practical approach to the study of the problem

5. Has component designs that fit into each other admirably

6. Aids in fulfilling the objectives of the study.


Thus, in the final analysis, each component design influences and 
is in turn influenced by the other component design, i.e., these 
are highly interrelated. Hence, particular care should be taken to 
integrate the different designs so that, at times, they complement 
and supplement each other.


Now that we have a pretty good idea of how to go about
determining the general research strategy of a study, let us take
the examples that we have introduced in the first chapter. Since
the Primer is on Survey Research, and all four examples are taken
to illustrate the survey design, what follows is only a stipulation
of the scope of each study. 


FIRST EXAMPLE: 
WOMEN AND EMPLOYMENT 


Time Dimension: Our interest in this study is in the present because 
the purpose in exploring this phenomenon is to find out if the 
climate is favourable for more women to enter the labour market. 


Place: The study is undertaken in a metropolitan city. 


Population: This is a question that touches all segments of society.
So we take the ‘whole population’. What do we mean by the
whole? Not everyone from 0 to 99 years of age. It would be
meaningless. So, let us be realistic and say that the population for
the purpose of this study would be all adults of age 18 plus years
(after all they can now vote in the general elections).


Source: Since we have already decided to find out the views of 
the adult population of the city, the source has to be primary. 


SECOND EXAMPLE: 
HOUSING SITUATION 


Time: The study is in the ‘present’ and in a manner of speaking 
‘futuristic’ for it is necessary to find out not only what the present 
situation is, but also to ascertain what the ‘realistic future’ pref- 
erences would be. So the time dimension for this study is the 
present and the future. 


The place is almost self-evident. It is a metropolis.

The population that is being studied is the totality of households
in the metropolis. But more specifically, it will be the heads of
households.

The source of data will invariably be the heads of households
themselves. It is necessary to keep in mind that a relatively large
proportion of the population may be illiterate or barely literate
and this would influence how we proceed with the collection of
relevant information. 


THIRD EXAMPLE: 
MEASURING CONSCIOUSNESS 


Here again, we have a clear picture of the scope of the study. 
The organisation has been conducting its programme of adult 
education for quite some time and now wants to find out how 
much the learners have progressed. Or more specifically, at 
PRESENT, what is their level of consciousness. 


Since the programme is being conducted in a rural area, the place 
will be the rural community. 


The population will be the adult learners who are participating in 
the programme. One may raise two questions here about the 
population. The first is: Why are we not studying those who have 
participated in past programmes and even those who have joined 
but left the programme? Second: How do we know that the 
consciousness that the people now have is the result of the 
programme and not in spite of it? 


These questions can be answered in two ways. The appropriate 
time at which these questions should have been raised would be 
when the objectives of the study are being determined. The fact 
that these questions arise when the scope is being decided and 
not earlier, shows that sufficient thought was not given at the time 
of formulating the objectives. 


Another way of looking at the situation is this. Despite the best 
efforts one may put in while formulating a research problem, it
is not unlikely that some aspect, dimension or viewpoint may not 
come to one’s mind, or be lost sight of or not even be thought 
of. But everything is not lost. This second stage of planning a 
research study provides an opportunity to us to ‘review’ what has 
been originally planned. In other words, the working on the SCOPE
of the study provides insights into what has been done and what
could have been done. 


Looking at the queries from the angle of who should and should
not be covered in this study, we have then to consciously ask the 
question: Did the NGO intend the study to cover the past and the 
present learners, and then compare these with the ‘drop-outs’ from 
the programme? Maybe the NGO did not think of this possibility. 
Hence, it becomes our duty to draw the attention of the authorities 
to this possibility. They could have one or more of the following 
responses to offer: 


1. Though the programme has been in operation for three years,
this is their first ‘batch’ and so the question of ‘alumni’ does
not arise

2. This is the first batch with which their programme has been
satisfactorily or ‘completely’ implemented. So the study of
the batch that had ‘completed’ the first training programme
may be inappropriate

3. The drop-outs are really not of interest to us at the present

4. We would very much like, but at a later stage after we have
been able to develop a good tool for measuring consciousness,
to do a more rigorous study to find out whether the level of
consciousness attained by the participants is significantly
higher than that which would be found in the general
population. So for the present, let us do a ‘modest’ study with the
primary (though unstated) aim of getting a ‘preliminary
reading’ of the levels and find out whether the results are
encouraging enough to continue.


FOURTH EXAMPLE: 
DCPs AND HUMAN DEVELOPMENT 


The scope of the study for this project is quite clear. The time 
dimension is the ‘present’; the place is ‘All India’; the population 
is the ‘directors of community projects’; and the source of 
information is primary, the directors themselves.


So we have now completed the task of working out the scope of
these studies and, while doing so, cross-checked the objectives as
well. In fact, it is at this stage of determining the scope of a study
that you can further ‘test’ out the objectives and make sure that
what is being studied is what you want to have studied and no
important dimension has been left out due to an oversight.


CHAPTER 3 


METHODS OF DATA COLLECTION


The most common method of data collection used in field studies 
is the INTERVIEW method or collection of information through 
personal contacts with individual respondents. An equally com- 
mon method of data collection that is used especially with 
educated respondents located over a wide geographical area, 
is the QUESTIONNAIRE. We shall return to this later in the Chapter. 
OBSERVATION, too, is used but more as a supplement to the 
Interview. It is particularly useful for recording physical data like 
the physical environment, conditions of physical objects, facial 
expressions (especially when the expressions ‘do not match’ the 
verbal statements) as corroborative data. However, we shall not 
be dealing with this method in this Primer. 


INTERVIEW METHOD 


Basically, the interview is a method in which two individuals are 
involved in conversation with each other, the first aiming primarily 
at getting responses from the other in the form of answers and 
explanations to the questions put by the former. Hence, the aim 
in this method is to get the feelings, opinions, knowledge, attitudes
and experiences of the respondents about the problems being
investigated. The basic elements of an interview are specifics and 
objectivity which facilitate the collection of reliable data. How- 
ever, the procedures utilised in this are not standard but depend 
on the manner in which the data are to be collected. 


The main advantages of the Interview Method are: 


1. It has a high degree of flexibility

2. It can be used with both literates and illiterates

3. It provides for high returns in responses

4. It can be used extensively in a wide variety of studies

5. Supplementary data can also be collected

6. It is thorough

7. It permits exhaustive exploration and probing

8. Its validity can be appraised

9. It allows the interviewer to observe reactions, which enables
the creation of the right atmosphere for interviewing.


The primary disadvantages of the Interview Method are: 


1. It is a very expensive procedure, e.g., transport in widespread
areas, difficulty in locating addresses, each person has to be
separately interviewed

2. It requires trained investigators who know the language of the
respondents

3. It requires the active co-operation and participation of
respondents

4. It depends on the availability of respondents as it can be done
only at their convenience

5. It introduces both interviewer and interviewee biases,
conscious and otherwise

6. It gives an interviewee a tendency to rationalise her actions.
This, coupled with the tendency of some respondents to
idealise their earlier experience, tends to affect the reliability
of data

7. A respondent may have memory lapses. It has been seen in
fertility studies undertaken in India that quite often women do 
not remember how many children they conceived. This is espe- 
cially true of cases where abortions, neo-natal mortality, still 
births occur. There are quite a number of cases where persons 
do not remember their date of birth or that of their children 

8. The respondents may evade answering some types of ques- 
tions. Respondents, as a general rule, are reluctant to answer 
questions they dislike. For example, if a person is asked if 
she has a tendency to consume intoxicants, she may not be 
willing to answer truthfully 

9. The interview method presumes the competence of the respon- 
dent to answer all questions. It must be noted that not all these 
obstacles are conscious, deliberate attempts by the respondent 
to furnish unreliable data. It may be possible that the respon- 
dent is doing it unconsciously and unintentionally. 


No doubt, there are many ways and means of correcting some of 
these shortcomings (some of these are dealt with in chapter six), 
but unless the investigator is well trained, she may not be in a 
position to detect them or verify them. 


INTERVIEWING TECHNIQUES AND TOOLS 


The techniques of interviewing may be broadly classified into 
structured, semi-structured and unstructured or non-structured in- 
terviews. The term ‘structure’ refers to the extent to which the 
questions are predetermined and degree of flexibility or rigorous- 
ness in the interviewing process. 


Where the questions relating to the coverage of items of informa- 
tion of the research study are pre-constructed, put down in a given 
order and followed rigorously, the interview is said to be a 
structured one. But when flexibility is permitted in terms of what 
questions are to be asked, how they shall be asked and worded 
or what details of information are to be collected, then it is said
to be a non-structured technique. The semi-structured contains
elements of the structured as well as the non-structured. Hence, 
we shall now discuss the first and the last in some detail. 


NON-STRUCTURED INTERVIEWS 


The main feature of the non-structured interview is the flexibility
in its usage. In this type of interviewing, generally, the investigator 
is equipped only with a listing of the major aspects, sub-aspects 
and components pertaining to the phenomenon under study and 
so to be covered during the interview. This is usually referred to 
as an interview guide. The specific questions are not listed. 
Therefore, the predominant feature of the technique is that the 
investigator opens the subject for discussion between her and the 
respondent and then asks questions as they arise during the 
discussion. In other words, each question is based on the answer 
to a previous question. 


This implies that it is very likely that the responses or experience 
of different respondents will not be the same or similar. Hence, 
it is not possible to anticipate each answer. Therefore, the question 
also cannot be predetermined. So this technique facilitates com- 
prehensive coverage of information through supplementary 
questions to the main question through probing, seeking clarification,
etc. Thus, if the respondent’s answers are too short, vague
or misleading, the investigator may, with tact, draw out the
respondent to obtain more detailed answers.


The main functions of the non-structured interview are to clarify 
the respondents’ views particularly in relation to: 


1. What she means when she uses some concepts or phrases 
2. The strength of her feelings and opinions 
3. Factors influencing her opinions 


The success of such an intricate technique of data collection 
depends mainly on the interviewer’s skill and the qualities she 
possesses.


It may be relevant here to briefly state the format of the final
recordings of non-structured interviews. Broadly, the detailed
responses must be classified and appropriately written with titles
and sub-titles. The order of these titles and sub-titles will be the
same as that given in the interview guide if this technique is used.
Otherwise, depending on the data being collected, it may be put 
chronologically or in sequence of occurrence. However, the guid- 
ing principle is that it should be in the order that the coverage 
has been prepared. In the extreme case, it may be worthwhile to 
write down the responses under each sub-title in a different sheet 
of paper, with all the sheets pertaining to one respondent securely 
pinned together. A final point to remember is to write down the 
responses in the third person. 


The main advantage derived from the non-structured interview
technique is that it is highly flexible in application. It provides
an opportunity for the respondent to express herself on the subject 
freely and fully, at the same time, enabling the interviewer to 
check on the reliability of the responses. Furthermore, the inter- 
viewer can determine her questions and procedures depending on 
each individual situation. 


The major shortcoming of this technique is that the responses may
be so divergent, even among the different responses of one respondent
herself, that the final classification and processing of the material
will be a very difficult task. Thus, while much interesting infor- 
mation may be collected, it is equally possible that from the 
viewpoint of the objectives of the study, these may be irrelevant. 


Another drawback is that these techniques take a lot of time and
so involve large amounts of money. It would not be possible to
interview many cases, though incidentally that is not the intention
of the use of such techniques.


Finally, the success of these techniques depends entirely on the
co-operation and communicability of the respondent as well as
the skills of the interviewer.


STRUCTURED INTERVIEWS


The main feature of this technique is that it is a rigorous form of 
data collection, providing for specific questions, and in an extreme 
form for a limited range of responses from which the respondent 
selects an appropriate one. In the same extreme form, divergent 
responses may not always find a place. But in a modified struc- 
tured interview, questions providing for free responses are added. 
Due to the lack of flexibility, the interviewer has limited scope
for rephrasing questions, or changing the order of questions to
suit the interviewing atmosphere and respondent. Furthermore,
verification is not always possible. Its reliability, among other
things, as in the case of even non-structured interviews, depends
on the degree of rationalisation and idealisation by the respondent
and the extent of selective recording by the interviewer. This
technique is primarily used when the researcher is interested in
getting first-hand information about the respondent’s opinions,
attitudes, personal problems and private behaviour patterns by
asking predetermined, relevant questions of the respondents.
When a set of predetermined questions is prepared in a given 
order, entailing the need to ask questions of the respondent in the 
order and manner in which it is put down, it is called an Interview 
Schedule. 


A schedule will include only those questions which are essential 
and relevant to the study. In order that this takes place, not only 
should the questions be properly framed, but the responses should 
also be appropriate to the questions. This does not imply that the 
correct answer should be obtained, for, in opinion studies and 
subjective questions, there may not be a ‘correct’ answer. The 
answer should, however, be valid and reliable, and express as far
as possible the true feelings, opinions and attitudes of the
respondents.


It has been seen that a schedule should be formulated keeping in 
mind the objectives of a survey, including its scope and coverage. 
Hence, not only should a schedule be formulated after specifying
the objectives but also after the bibliographical survey has been
undertaken, after the decision is made on who is to be interviewed,
and the method or methods of data collection to be utilised. This
is being stressed here because very often, one comes across
research ‘enthusiasts’ who are of the firm opinion that once it is
decided to undertake a survey the first thing to do is to prepare
a schedule or questionnaire. It is not realized that many problems
abound, which cannot be effectively studied with the aid of highly 
structured schedules and questionnaires, e.g., case studies pertain- 
ing to prostitutes and especially their family background, could 
be done better using at the most interview guides. No doubt, some 
aspects of the problem can be studied by the use of schedules. 


Principles in the Construction of Interview Schedules 


The general principles in the formulation of a schedule are: 


(a) The questions asked should have a bearing on the objectives 
and hypothesis of the study. That is to say, while one may 
be tempted to include interesting questions, only relevant ones 
should be framed. Relevance is determined by whether or not 
the item or items have already been listed in the first place 

(b) Only those questions which cannot be elicited by any other 
method or technique should be included in a schedule 

(c) The form should be as short as possible and take into
consideration the time factor. Where long-winded schedules are
used, not only will the interviewer find it difficult to retain 
the attention of the respondent, but it is also possible that the 
latter may not be co-operative 

(d) The questions included should be such as can be answered 
by the respondents. For example, there may be little point in 
asking questions on the contents of the Constitution in a
schedule to be canvassed with illiterate agricultural workers 

(e) Questions which are definitely known to bring in unreliable 
answers must also be avoided. To illustrate, it may be meaningless
to ask a beggar her monthly income and detailed
expenditure, for she may not only not keep track of such
expenditure, but may also be uninterested in it

(f) The questions included should, as far as possible, be short and
to the point. The exception to this is when schedules are used 
in ‘concealed studies’, i.e., where a conscious attempt is made 
to hide from the respondent the actual purpose of the study. 
When the respondent knows the purpose of the study, there 
is a likelihood of biased answers. 

(g) Objective, factual and simple questions should precede sub- 
jective and complex questions 

(h) Each question should include only one idea and elicit only 
one answer 
One of the questions included in a draft interview schedule was: 
If married, date of marriage. Quite a few respondents may 
have been married more than once. Hence, the question has 
to be changed to read as:
(i) Date of first marriage............. 
(ii) Date of second/third marriage....../..... 
Another example is the question about family size and composition:

                       Adults          Children below 13
                   Male    Female      Male    Female
Earners
Dependents
   Earning
   Non-earning

Many an Indian respondent does not consider her family as consisting
exclusively of those residing with her in an urban area. She may
insist that the relatives in her native village should also be
included as her family members. So, during the pre-test, it may 
be found that many of the respondents have some family mem- 
bers living with them in the city and others in the native place. 
Since it is necessary to get this information also, each column 
is further sub-divided into City/Native place. 


(i) Questions should never be repeated in a schedule, e.g., if one
is gathering information about the children in a family, the
basic data about them must be obtained at the outset once and
for all. The basic question should not be repeated in different
sections of the schedule.

(j) Questions to which answers are obvious should also be
omitted. Consider the following table as an example:


EDUCATIONAL LEVEL OF CHILDREN 0 TO 7 YEARS

Age    No. of Children    No. of Children        Reasons for Not
       Going to School    Not Going to School    Going to School

0-2                                              obviously too young
3-7                                              which one

It will be obvious that children between the 0-2 age group do 
not attend school (or do they?). Further, 3-7 covers both 
school-going and non-school-going ages. Hence, they should 
not have been grouped into one category. 


(k) Leading questions should never be asked, e.g., ‘Do you think
the accident to the child occurred because you were away on
your job and could not adequately look after the child?’ This
question is defective in two ways:

(i) It may invite the respondent to answer with a ‘Yes’, for 
no mother would like to feel that such a thing would have 
occurred if she were around the place, and 

(ii) Again, since the appeal is to ‘could not adequately look
after,’ she may say ‘Yes’ again because she may think that
she alone could look after her child.

A second example is that of asking the individual some questions
concerning her health, i.e., type of illness, duration, frequency,
and finally, ‘Was not employment the cause?’ Or again ‘Do you feel
that the nature of your job will require some more sick
leave?’ And ‘Has not the cost of production of bidis
increased since the enforcement of the minimum wages?’
(l) When questions are followed with alternatives provided in the
form itself, care must be taken to see that all possible alternatives
are provided. For example, ‘marital status: married/unmarried’
does not provide for ‘widows’ and ‘widowers’.
Again the question, ‘Are you satisfied with your job?
(Yes/No)’, does not provide for the other shades of opinion
like ‘Indifferent’ and ‘Partly satisfied’ etc. The question: ‘Do
women get jobs for which they are trained? Yes/No’ will
also not permit the alternatives of ‘Some’, ‘Most’, etc.
Another question which needed modification after some use
was: ‘Under what circumstances would you be prepared to
leave your present job? Wage rise/employer preference/industry
preference/occupational preference’. It was observed
that respondents often gave more than one response, and in many
cases the responses had not been anticipated. Hence, it was
reconstructed to read as:
For which of the following preferences would you be prepared
to leave your present job and take up another? State your order
of choice.


Economic: 
(i) Wage rise 
(ii) Opportunity for advancement 
(iii) Security of job 
(iv) Benefits such as medical, housing etc. 
(v) Opportunity to learn a new trade 
(vi) Reputation of employing firm. 


Sociological and Psychological:
(i) Work of your liking
(ii) Work with higher prestige
(iii) Comfortable working conditions
(iv) Good treatment by boss
(v) Work located in region of your choice.
In a study on absenteeism, questions in the draft were:
(i) Do you own or are you a partner in any business? Y/N
(ii) If yes, what are your duties in this respect?
(iii) Who attends to emergency situations?
It was found that none of the respondents had any business.
But quite a few had agricultural land. Hence, the above
questions were dropped and questions pertaining to agricultural
land included. Two other questions were asked:
(i) In which shift do you work?
(ii) Do you like overtime work?
Since shifts were by rotation and not a matter of choice, and
overtime work was not provided for in the mills, these questions
were dropped.
The question, ‘Do you get sufficient time to transact personal
business after work hours?’ was also dropped because all said
‘yes’. Then again the question, ‘How does your supervisor go
about his work?’ had to be dropped since it was too ambiguous.


(m) Contradictory statements should not be provided for in the 
same question, giving room for a further complication of a 
dichotomous answer. For example, ‘Are the cloakrooms 
provided by the management common for all grades or status? 
Y/N?’ It is obvious that a ‘Yes’ to one may imply a ‘No’ to 
the other. Or again: ‘Do the cloakrooms become overcrowded 
during the recess or any other hour? Very much/not very
much/never.’


Another example is: 


Health of the members of the family:
Adults               good/fair/poor
Children (5-15)      good/fair/poor
Infants (below 5)    good/fair/poor


In the first place, the adjectives are subjective and, in the
second place, what if the health of one adult is good, that
of another is only fair and that of the third is poor? An average
cannot be obtained.


(n) The questions should be logically arranged. Consider the following:
1a. What are the handicrafts that you are learning?
1b. In what other activities do you participate?
2. Do you have any choice?
3. Does the agency pay for your work?
4. Which do you like the best?
5. Which do you like the least?
6. How long have you been a member here?
7. Have you any suggestions to make?
Apart from the fact that questions 1, 2, 3 and 4 do not clarify
the particular subject to which they refer, these questions seem
to be in some haphazard form. The questions must flow from 
the general to the particular. For example: ‘What do you like best
among the radio programmes?’ followed by ‘Do you like the
radio programmes?’ The second question should have been asked
first.
Coming to the example of rearrangement of questions in order 
to obtain a logical flow of answers, a draft interview schedule 
contained questions on employment data in tabular form in 
the following order: 
1. Job number
2. Designation
3. Employer
4. Industry
5. Place of job
6. Sources of information about job
7. Shift classification
8. Vocational and technical training for job
9. Employment period: joined, left
10. Consolidated salary: joining and leaving
11. Reasons for joining the job
12. Reasons for leaving the job
13. Do you consider this job an improvement over the last job?


The questions were rearranged in the following order (question
numbers are given): 1, 2 (here the heading was changed to nature
of work), 8, 3, 4, 5, 6, 11, 13 (this was reworded as the difference
between consecutive jobs, + and -), 9, 10, 12, 7.
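
A rearrangement of this kind is easy to get wrong by dropping or repeating an item, so it can be worth checking mechanically. The following is only a minimal sketch in Python, not part of the original study: the headings are shortened from the draft list, the ninth position is taken to be item 13 (the only draft item not otherwise listed), and the names `DRAFT_ITEMS`, `REWORDED`, `NEW_ORDER` and `rearrange` are hypothetical.

```python
# Draft order of the employment items, keyed by draft question number.
DRAFT_ITEMS = {
    1: "Job number",
    2: "Designation",
    3: "Employer",
    4: "Industry",
    5: "Place of job",
    6: "Sources of information about job",
    7: "Shift classification",
    8: "Vocational and technical training for job",
    9: "Employment period",
    10: "Consolidated salary",
    11: "Reasons for joining the job",
    12: "Reasons for leaving the job",
    13: "Improvement over the last job",
}

# Headings that were reworded during the rearrangement.
REWORDED = {
    2: "Nature of work",
    13: "Difference between consecutive jobs (+ and -)",
}

# New sequence of draft question numbers (assumption: ninth entry is item 13).
NEW_ORDER = [1, 2, 8, 3, 4, 5, 6, 11, 13, 9, 10, 12, 7]


def rearrange(items, order, reworded):
    """Return the headings in the new order; refuse an order that drops or repeats an item."""
    if sorted(order) != sorted(items):
        raise ValueError("the new order must use every draft item exactly once")
    return [reworded.get(number, items[number]) for number in order]


final_sequence = rearrange(DRAFT_ITEMS, NEW_ORDER, REWORDED)
for position, heading in enumerate(final_sequence, start=1):
    print(position, heading)
```

The check inside `rearrange` is the useful part: any order that omits a draft question, or names one that does not exist, is rejected before the schedule is retyped.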


Types of Questions 


(a) Close ended Questions: A close ended question is one in which
the responses are limited to one or more of the alternative
answers provided along with the question. Hence, in a closed
question, the questions are given and the answers are suggested.
Examples are:

(i) Dichotomous questions: Sometimes a question asked calls
for only one of two possible answers, e.g., Sex: Male/Female.
Do you know how to read English? Yes/No. Such
questions, where a response has to be between two alternatives,
are called dichotomous questions.

(ii) Multiple choice questions: This kind of question differs
from the first one, in that it provides more than two
alternatives, e.g.,

Marital Status: Married/Unmarried/Divorced/Widowed/Separated.

Are you satisfied with your vocation? Satisfied/Indifferent/Dissatisfied.

(b) Open ended questions: Such questions do not provide the
respondent with possible answers. She is free to make her own
responses, which may be one word answers or statements running
into a few lines.

For example: ‘What is your opinion about the present
education system?’


A third type is the combination of (a) and (b). This is when the
researcher can put down most of the alternatives as in a closed
question but yet is aware that there may be some alternatives of
which she is not aware and she leaves a last alternative: ‘others
(specify)’.
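
If the schedule is later to be keyed into a computer, the three types of questions can be told apart by what is stored with each question. What follows is a minimal sketch in Python under that assumption; the class `Question` and its fields `alternatives` and `allow_other` are hypothetical names chosen for illustration, and the example questions are those used above.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    alternatives: list = field(default_factory=list)  # left empty for an open ended question
    allow_other: bool = False  # True adds a last alternative 'others (specify)'

    def __post_init__(self):
        if len(self.alternatives) == 1:
            raise ValueError("a closed question needs at least two alternatives")

    def kind(self):
        """Classify the question by the alternatives supplied with it."""
        if not self.alternatives:
            return "open ended"
        if self.allow_other:
            return "combined"
        if len(self.alternatives) == 2:
            return "dichotomous"
        return "multiple choice"

    def accepts(self, answer):
        """True if the answer can be recorded against this question."""
        if not self.alternatives or self.allow_other:
            return bool(answer.strip())  # any non-empty response is recorded
        return answer in self.alternatives


reads_english = Question("Do you know how to read English?", ["Yes", "No"])
marital = Question(
    "Marital status",
    ["Married", "Unmarried", "Divorced", "Widowed", "Separated"],
)
opinion = Question("What is your opinion about the present education system?")

for question in (reads_english, marital, opinion):
    print(question.kind(), "-", question.text)
```

A closed question rejects any answer outside its list, which is exactly why a missing alternative shows up so quickly in a pre-test.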


Components of an Interview Schedule


Once the questions to be included in the interview schedule are 
determined, the next stage is to construct the schedule itself. The 
schedule consists of four components, viz., the heading or title, the
identifying and background data, the body and the interviewer’s 
observation and remarks. Let us consider each in detail. 


Heading or Title: Every schedule must have a heading. This 
consists of the following: 


(a) The agency conducting the study, including the address 
(b) The title of the study 

(c) The classification and serial number 

(d) The type of respondent to whom the interview applies. 


Identifying and Background Data: This would include (a) such
questions as would identify the respondent by name and address, and
(b) such questions as would form the basis for further analysis
of the answers to questions pertaining to the study proper, e.g.,
age, sex, religion, family size, and composition. Only those questions,
the answers to which would be utilised in classifying the
respondents into exclusive groups, would be included here.


The Body: This would contain the questions on the problem under 
study. The manner and form in which they should be asked and 
arranged has been discussed earlier. 


The Interviewer’s Observation: This consists of: 


(a) The remarks or observations of the interviewer in respect of
the reliability of data, the setting, the reactions of the
respondents, and additional remarks of the respondents which
have not been specifically asked for in the ‘Body’ and yet are
by-products of the answers to the questions asked.
(b) The place, time and duration of interview and signature of the
interviewer.


Finally, every question should be serially numbered. Sub-questions 
should have sub-numbering like 5, 5a, etc. 


Pre-Test 


Once the questions have been formulated, the next step is to 
pre-test the schedule in order to determine that the questions are: 


(a) appropriate to the study 

(b) clearly worded without any ambiguity of meaning 
(c) properly arranged in logical sequence 

(d) not subject to poor responses 

(e) not subject to inconsistent or irrelevant answers 
(f) adequate. 


In case it is found that some additional questions have to be asked 
they may be included in the final form. 


Finalisation 


Once the pre-test is completed, the schedule has to be finalised.
Where the changes to be made are drastic, it is necessary to 
pre-test the re-drafted schedule also. When the final schedule is 
ready, it is necessary to prepare the ‘Instruction for Investigators’ 
to enable them to fill in the form correctly. The instructions contain 
such information as definitions of terms, meaning and scope of 
some questions, coding of answers, etc. 


The advantages and disadvantages in the use of the interview 
schedule are generally similar to those mentioned in connection 
with interviews. 


QUESTIONNAIRE METHOD 


At the very outset, it would be worthwhile differentiating between
the schedule and the questionnaire. While these terms are often used
synonymously, it is necessary to bear in mind that though the
general structure and composition of both of these forms would
be more or less the same, yet in detail, they vary and the research
design will be greatly modified according to whether one or the
other is utilised.


STRUCTURE 


The basic difference between the schedule and the questionnaire 
is that while the former is canvassed, the latter is filled in. When 
a schedule is being used for a sample survey, the interviewer asks 
the questions to the respondent and the latter’s replies are written 
down by the former against the appropriate questions. In the case 
of questionnaires, the investigator hands over the form to the 
respondent requesting the latter to go through it and fill it in. Once
this basic difference between the two approaches is understood,
the student will realise that other fundamental differences follow from it.
Briefly, these are: 


(1) The structure of the questions would more likely differ between
the two forms. For, while in the case of schedules the
interviewer will be able to reframe the questions in a manner
that will be understood by the respondent, in the case of
questionnaires, one has to depend on the respondent’s
interpretation of each question. And in the case of some questions,
there is every likelihood that no two respondents will interpret
the same question identically.

(2) The schedule is invariably used with illiterates, whereas the 
questionnaire can be used only with literate respondents. 

(3) The schedule is fundamentally a form for interviewing and,
under the appropriate circumstances, can cover more topics
than a form which a person is asked to fill in herself, i.e.,
the time element and the corresponding length of the form
will differ in each case.

(4) There is a likelihood of having a higher non-response to 
questionnaires than with schedules. Accordingly, the sample 
size of the study will differ materially. This is all the more 
true of mailed questionnaires. 
(5) The cost of a study using schedules as a form of data collection
will be higher than that using questionnaires and correspondingly,
the former will involve a longer time element than the latter.
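
Point (4) has a simple arithmetical consequence: the lower the expected rate of response, the more forms have to be issued to end up with the same number of completed ones. The sketch below, in Python, illustrates this under stated assumptions. The function name `forms_to_issue` is hypothetical, and the figures of 400 completed forms and response rates of 90 and 40 per cent are invented for illustration, not drawn from any study.

```python
def forms_to_issue(completed_needed, expected_response_percent):
    """Number of forms to issue so that, at the expected response rate,
    at least `completed_needed` come back filled in (rounded up)."""
    if not 0 < expected_response_percent <= 100:
        raise ValueError("response rate must be above 0 and at most 100 per cent")
    # Integer arithmetic for rounding up avoids floating-point surprises.
    return (completed_needed * 100 + expected_response_percent - 1) // expected_response_percent


# Illustrative figures only: a canvassed schedule versus a mailed questionnaire.
print(forms_to_issue(400, 90))  # 445
print(forms_to_issue(400, 40))  # 1000
```

The saving in cost per form with a questionnaire therefore has to be weighed against the larger number of forms that must be sent out.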


The above are the major differences between the two forms of
data collection. As already stated, while they differ materially and
involve other considerations in the matter of research design, the
basic principles involved in the construction of these forms are
about the same. The structured forms are objective.


QUESTIONNAIRE CONSTRUCTION 


In addition to the rules which apply to the construction of a 
schedule, the following points should be kept in mind when a 
questionnaire is prepared. 


(1) The questionnaire should contain within it all the necessary 
explanations, instructions, and clarifications regarding the 
definition of some terms. 

(2) The questions in a questionnaire should not be ambiguous or
lend themselves to different interpretations. For example,
‘What do you think about the present government policy?’
This does not explain which particular policy is referred to,
nor to which government.

(3) As far as possible, the questions should be few and not call
for elaborate answers from the respondent. Long questionnaires
invariably tend to be ignored by respondents.

(4) Questions should be worded in as simple a language as is 
possible. 


ADVANTAGES AND DISADVANTAGES OF QUESTIONNAIRE 
METHOD 


Advantages: 


(a) The questionnaire method is the less expensive procedure 
(b) Little skill is required to administer it 

(c) It can be self administered 

(d) It can be administered to a large number of people 

(e) It can be sent through mail 
(f) It ensures uniformity from one measurement situation to
another
(g) There is no pressure of time on the respondent.


Disadvantages: 


(a) The information obtained is limited to the written responses
of the subject to pre-arranged questions

(b) It is too narrow in scope as only limited questions are asked 

(c) The validity of responses cannot be assessed 

(d) It does not permit deep probing 

(e) It cannot be used with illiterates 

(f) The sample may be incomplete. 


Having completed our ‘theoretical’ understanding of the interview 
method, let us work out the general strategy and the first specific 
research design — the method of data collection — for each of 
the four examples. So far as the general strategy is concerned, it 
would be quite clear that we are required to have the survey 
research strategy. Apart from the fact that this is the obvious one 
since we have included these examples in this document, the more 
serious point is that it has been argued to be so. Oh yes, we did
not as yet show how we derived it, we only assumed it. So let 
us work this out. 


FIRST EXAMPLE: 
WOMEN AND EMPLOYMENT 


First, the objectives demand an ESTIMATE, which means that we 
need to have fairly accurate information on the number and 
percentage of adults who would be in favour of women working. 
Secondly, the objectives also require that we find out who favours
and who disfavours women taking up employment. Since information on
the kind of characteristics we are interested in with respect to
employment is not available to us, it has to be collected from the
field. Thirdly, estimates invariably require that samples be taken
when the total population to which the question is addressed is
extremely large, for it would be cheaper, and maybe more accurate,
to get the same from a sample. For all these reasons, we resort to
a sample survey.


Now, coming to the specific designs, we have already decided on
the sample. So the question is about method of data collection. 
Since a large number (even if it is only a sample) has to be 
interviewed, it is advisable that we use a standardised approach 
and so an interview schedule. 


So without much ado, let us get the draft interview schedule ready.
You will notice shortly that if the exercise of listing the
variables/items of information has been carefully done and in great
detail, the task of drafting the interview schedule is a very simple
and quick job. See for yourself below. On the left is the list of
items of information that have to be collected. On the right is the
list of corresponding questions that go to form the interview
schedule. Of course, the schedule itself is not ready, for it has to
be improved upon in respect of its format. That is of course for
later. So first, the frame or flesh.


Item of Information       Questions

1. Age                    What is your age?

2. Sex                    Sex (we don’t ask the question,
                          just record it).

3. Marital status         Are you married or single?

4. Education              What is the highest class you have
                          attended?

5. Occupation             What is your occupation?

6. Income                 What is your family income per
                          month?

7. Should women work?     In your view, should women get
                          employed?


Well, your draft interview schedule is ready. But have a second
look at it. One way to critically review it would be for you to
respond to the questions as though you were being interviewed.
Or you can have someone else in mind or you could even ‘role
play’ a respondent.


Respondents may have no difficulty in responding to the first
question. The second is no problem for you as an investigator.
But take the third question. The respondent may tell you ‘I am
neither single nor married’. Possible? Why not? It depends on
how a person looks at her status. For example, one who is
separated from her spouse is temporarily neither single nor married.
So we had better modify that question before it becomes an
embarrassment to both sides. Therefore, we will now revise and
reword the question, which will now read as: What is your marital
status?


In order to simplify our task later, we can now list out all probable 
responses. The appropriate answer can then be ticked during the 
interview. So the question now reads: 


Q3. What is your marital status? 
Unmarried/married/separated/divorced/widowed 


Incidentally, you have a choice as to how you will put the question 
to the respondent. You could just ask the question, and leave it 
to the respondent to give an answer (in which case you are treating 
this as an open question), and you tick off the response given to 
you. Alternatively, you could ask the question and also read out the
alternative responses.


Now, look at question 4. How do we account for illiterate interviewees?
For they may just say ‘not gone to school’ (they could still
be literate). So we simplify this question also and ask the following
question and provide alternatives:


Q4. What is your education level?
Illiterate/literate/primary/middle/high school/college/graduate/
post graduate/professional/technical.
Question 5 has a similar problem. Some may be not employed,
some unemployed and others may be in employment. Therefore,
we can ask:


Q5. Are you working? 
Yes, employed as ......
No but I am looking for a job 
No and I am not looking for a job as I am...... 


If you so desire, you could even fill in the dots with appropriate
answers as in the case of Q3 and Q4. 


Coming to question 6, quite a few people are reluctant to give 
their income. Hence, the procedure usually followed is to give 
them a series of income groups/categories and ask them which 
category they would put themselves in. This of course depends 
on the spread of each income group. It could be in 100s, 250s, 
or 500s, in the initial stages, and in 1000s, or more at the higher 
levels. But it is usual to fit the blocks to reflect the optional very 
low, low, moderate or middle, upper middle, and upper income 
categories. 
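

To see how such blocks work in practice, here is a minimal sketch in Python of a set of income categories that are narrow at the lower end and wider higher up. The boundaries in `BLOCK_STARTS` (rupees per month) are purely hypothetical, as are the names `LABELS` and `income_category`; in a real study the blocks would be fitted to the population being surveyed.

```python
from bisect import bisect_right

# Hypothetical boundaries: each figure is the income at which the next block begins.
BLOCK_STARTS = [250, 500, 1000, 2000]
LABELS = ["Very low", "Low", "Middle", "Upper middle", "Upper"]


def income_category(monthly_income):
    """Return the label of the block into which a monthly income falls."""
    if monthly_income < 0:
        raise ValueError("income cannot be negative")
    # bisect_right counts how many block starts are at or below the income.
    return LABELS[bisect_right(BLOCK_STARTS, monthly_income)]


def show_card():
    """Lines to read out or show to the respondent, one per block."""
    lines = ["Below %d" % BLOCK_STARTS[0]]
    for low, high in zip(BLOCK_STARTS, BLOCK_STARTS[1:]):
        lines.append("%d to %d" % (low, high - 1))
    lines.append("%d and above" % BLOCK_STARTS[-1])
    return lines


for label, line in zip(LABELS, show_card()):
    print(label + ":", line)
```

The respondent only has to point to a block, and the investigator ticks the category; nobody is asked to state an exact figure.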


The seventh question is just what it says. However, we are often 
curious to know why some people are in favour and others are 
not. So we may want to add a supplementary question as follows: 


Q8. What are your reasons for this view? 


Well, the draft interview schedule is now ready for pre-testing. 
But before that, it would be useful to make a few observations 
which are very relevant to the construction of an interview 
schedule. 


First, to the extent possible, try and provide alternate responses 
as well. That is to say, let your questions be fully ‘closed’ or at 
least ‘partially’ closed. This increases the control over the data 
that you will collect. You can always provide for a final open
category labelled ‘Any Other? Please Specify’. It may not always
be possible to do this initially. But the pre-test, when properly
conducted, will yield a lot of useful data.
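

One use of the pre-test data is to see which answers written against ‘Any Other? Please Specify’ keep coming up, since these are candidates for becoming printed alternatives in the final schedule. A minimal sketch of such a count in Python follows. The answers listed are invented for illustration, and the function name `frequent_other_answers` and the cut-off of three mentions are assumptions, not a rule.

```python
from collections import Counter


def frequent_other_answers(other_answers, minimum_count=3):
    """Return the 'other' answers given at least `minimum_count` times,
    most frequent first, as (answer, count) pairs."""
    tidy = [answer.strip().lower() for answer in other_answers if answer.strip()]
    counts = Counter(tidy)
    return [(answer, n) for answer, n in counts.most_common() if n >= minimum_count]


# Invented pre-test responses recorded under 'Any Other? Please Specify'.
pretest_others = [
    "Family needs income",
    "family needs income ",
    "Children are neglected",
    "Family needs income",
    "",
    "Husband objects",
]

print(frequent_other_answers(pretest_others))
```

Answers that appear only once or twice can stay under ‘Any Other’; those that recur are better listed, so that the question becomes more fully ‘closed’.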


The purpose of the pre-test is to determine that the questions are: 


(a) appropriate to the study 

(b) clearly worded with no ambiguity of meaning 

(c) properly arranged in a logical sequence 

(d) not subject to inadequate, irrelevant or inconsistent answers 

(e) complete, with no unnecessary questions included or necessary
questions excluded.


Once the pre-test is completed, the interview schedule has to be 
revised. If the changes are many and varied, it would be advisable 
to pre-test the revised schedule also. 


The interview schedule as revised has to be finally checked against 
the objectives to make sure that no objective is left incompletely 
covered (i.e., some components of it are not fully covered). 


The final step then is to prepare the final copy for printing etc. 


Once the final interview schedule has been prepared, it would be 
advisable to prepare a ‘manual of instructions for investigators’ 
explaining each question and how the schedule is to be filled in 
correctly. The manual would also contain instructions on how the
interviews are to be conducted and any other guidance appropriate to
investigators. More of this is covered later in Chapter 6.


SECOND EXAMPLE: 
HOUSING SITUATION 


Here too, we list in the first column the items of information 
required to fulfil the objectives of the study and in the second 
column the question form of each item of information. You will 
notice that some of the ‘questions’ are not ‘questions’ as such to 
be asked of the respondent but those which can be ‘observed’ and 
the appropriate ‘answer’ recorded. An example is that of SEX. 
Another is TYPE OF DWELLING (hut, chawl, flat, bungalow). Hence,
such questions will be put down in CAPITALS. 


Items of information      Questions

1. Existing Dwelling

   Type of dwelling       Hut/Chawl/Flat/Bungalow

   Number of dwellings    (number of dwellings of each type will be
                          calculated after the interviews)

   Number of rooms        What is the number of rooms in this dwelling?

   Facilities             Does the dwelling have the following?:
                          Kitchen, Verandah, Bathroom, Balcony,
                          Lavatory, Electricity, Water supply.

   Tenancy status         Tenant, Sub-tenant, Caretaker, Licensee,
                          Paying guest, Free tenant, Sharing tenant,
                          Owner.

   Rent paid per month    How much rent/compensation/fee do you
                          pay per month?

2. Preferred Dwelling

   Type                   Given a choice, what type of accommodation
                          would you need for your family?:
                          Hut/Chawl/Flat/Bungalow

   Number of rooms        How many rooms should this dwelling have?

   Facilities             What are the facilities it should have?:
                          Separate kitchen
                          Bathroom: self contained/in same building
                          Lavatory: self contained/in same building
                          Verandah: self contained/common
                          Balcony: self contained/common
                          Electricity
                          Water supply: self contained/in building

   Tenancy status         Tenant/Owner/Any other

   Rent payable           What is the maximum rent per month that
                          you can pay for this preferred accommodation?

3. Characteristics of household

   Socio-economic         (computed from answer to following 3 questions)
   status                 What is your education level?
                          Illiterate/Literate/studied upto...
                          What is your occupation?
                          What is your household income per month?

   Household size         How many members are there in your household?

   Place of origin        From which state have you/your elders
                          come to this city?

   Period of stay         How many years have you been living in
   in metro               this city?
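

The entry against ‘Number of dwellings’ says that the count for each type is worked out after the interviews, from the type of dwelling recorded on each schedule. A minimal sketch of that tally in Python is given below. The recorded types are invented for illustration and the name `dwellings_by_type` is hypothetical.

```python
from collections import Counter

DWELLING_TYPES = ["Hut", "Chawl", "Flat", "Bungalow"]


def dwellings_by_type(recorded_types):
    """Count the schedules falling under each type of dwelling."""
    counts = Counter(recorded_types)
    unknown = set(counts) - set(DWELLING_TYPES)
    if unknown:
        raise ValueError("unrecognised dwelling type(s): %s" % ", ".join(sorted(unknown)))
    # Report every type, including those with no cases, in the printed order.
    return {dwelling: counts[dwelling] for dwelling in DWELLING_TYPES}


# Invented observations, one per completed interview schedule.
recorded = ["Chawl", "Hut", "Chawl", "Flat", "Chawl", "Hut"]
print(dwellings_by_type(recorded))
```

Because the type is observed and recorded rather than asked, the only error this check can catch is a wrongly written entry, which is reported instead of being silently counted.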


THIRD EXAMPLE: 


MEASURING CONSCIOUSNESS 


The questions related to each of the items can now be written 
down with very little effort. 


1. Social Awareness

   (a) Knowledge:

       Untouchability     What do you know about the problem of
                          untouchability?
                          Is there a law against untouchability?

       Land ceiling       Do you know if the government has done
                          anything about excess land holdings?

   (b) Action-Intention:

       Untouchability     What would you do if you came to know
                          that a scheduled caste person has been
                          prevented from drawing water from the
                          community well?

       Excess             What would you do if you came to know
       landholdings       that a person has more land than allowed?

2. Social Functionality

   Knowledge of:          Do you know about/entitled to have:

   Savings account        Savings account in banks/post office
   Loans for
     Housing              housing
     Employment           employment
     Agricultural         agricultural operations
     operations
   Free
     Fertilisers/seeds    Free fertilisers/seeds
   Medical services       medical services
   Veterinary services    veterinary services
   Hostels                hostel facilities
   Education              education
   Scholarship            scholarships
   Ration card            ration card

3. Analytical Skills

   Causes of poverty      What, according to you, are the causes of
                          poverty?

   Party responsible      Who do you think is responsible for it?
                          In what way is this party responsible?

   Can be solved          Can the problem be solved?
                          What is the solution?

   Sources of help        Who can help in alleviating poverty?

The items below are self-explanatory and most of these are already 
in the form of ‘questions’. So all that you have to do is repeat 
them in the right hand column to complete the assignment. In a 
few instances you may have to change the form into a question. 
For example, the first item ‘the most serious problem’ will now 
become ‘What in your opinion is the most serious problem of the 
village?’ So, I leave this task to you. 


4. Awakening Consciousness 


A. The village problem: 
(i) The most serious problem 
(ii) Reason problem considered most serious
(iii) Causes of the problem

(iv) Party responsible for the problem 

(v) Can problem be solved? 


B. Villagers’ participation in problem solving: 
(i) Efforts made to solve problem 
(ii) Outcome of efforts made to solve problem 
(iii) Whether villagers normally get together to solve village
problem?
(iv) Will villagers get together to solve village problem? 


C. Role of self in problem solving: 
(i) Can the respondent solve the problem? 
(ii) Will respondent participate with others in solving the 
problem? 
D. Other instruments for solving problem: 
(i) Luck 
(ii) Prayer 
(iii) Dissemination of information 
(iv) Education 
(v) Changing people’s attitudes 
(vi) Organising people. 
E. Perception of major actors. 
(i) Oppressor 
(ii) Oppressed 
(iii) Self-image 
(iv) Pride in belonging to the oppressed class. 


FOURTH EXAMPLE: 
DCPs AND HUMAN DEVELOPMENT 


This exercise can be completed without any preliminary com- 
ments. In fact, you can complete the exercise yourself. 


1. Sex
2. Age
3. Time spent doing social work: full time/part time/occasionally
4. Total number of years in social work
5. Training for social work: formal/informal/nil
6. Participation in community work.


You may need some guidance to introduce the statements that 
help measure the understanding of human development. This 
battery should be introduced with the following statement by you 
to them: 


Would you say that Human Development is : 


To ensure that all persons have a secure and adequate livelihood?
To rehabilitate the handicapped/destitute?
To promote a healthy environment?
To promote formal education?
To organise vocational/skill training to all those who desire it?
To organise people to determine for themselves in a collective
manner their own growth and the growth of the community?
To organise relief and rehabilitation works?
To gain an entry into a community for development work?
To gain the confidence of the people?
To give the impression that the organization is doing something
for the poor?
To enable people to discover the resources within themselves?
To heighten the awareness and critical ability of people?
To provide financial and material aid to the people?
To enable people to achieve economic well-being?
To have respect for the human person?
To love everyone in need irrespective of caste, creed and status?
To give material help to the poor?
To use one’s skill for the betterment of the poor?
To take part in rallies, demonstrations, protests in favour of
the poor?
26. To bring together people to actualise problems and become
aware of their resources and capabilities? 


As a general observation which would apply to all four examples,
remember that the final interview schedule should have the following
at the top of it:


Name and address of the organisation doing the study. 
The title of the study 

The respondent group to which it is to be administered 
Serial No. of the interview schedule. 


Additionally, it is permissible to write alternate questions in the 
right hand margin. Later, in step three, it would be necessary to 
check each of these and decide which would be the most appropriate
one to be retained.


In step three, scrutinise the questions listed in the right hand 
column, and check if any additional questions need to be added 
to ‘complete the idea’ implied in each of the items in the left 
hand column. For, sometimes a word may contain more than one 
idea or a composite of ideas and so has to be sorted out. This is 
not always readily discerned when the list is being prepared. So, 
when the item is being converted into a question, the inadequacy 
of a single question to reflect the item is evident. This has to be 
‘made up for’ by increasing the questions. The fourth step is to 
check each of the questions and determine whether the language 
used fully expresses in clear and unambiguous words, the idea to 
be conveyed to the respondent. 


Now, translate the draft interview schedule into the language or 
languages in which the schedule is to be canvassed with respondents.
This translated schedule has to be tested out (referred to
as pre-tested) for use. The tool for data collection is now ready. 


It may be necessary and useful to resort to the tapping of information
that is already available in one form or the other in records
and reports like the Census, Government reports, Annual reports 
and so on. 


CHAPTER 4 


SAMPLE 
DESIGN 


INTRODUCTION 


When you decide to undertake a survey research project you are 
really talking, among other things, about interviewing a sizeable 
number of persons. So the questions you have to contend with are: 


(1) What is the optimum number of persons that you must interview
in order to get results that would represent the population
from which these persons to be interviewed are drawn? In
other words, what should be the sample size? and

(2) What is the appropriate procedure that you must follow in 
order to get a representative sample? 


This chapter is devoted to these two questions. More specifically,
in order to answer the first question we shall consider two
procedures, the statistical and the substantive. This will be in the
context of whether all members of a universe (also known as the total
population) need to be studied or only some of them.


As regards the second question, we shall concentrate on a few 
commonly used ‘probability’ and ‘non-probability’ techniques. 


But, let us first consider the basic question of all versus some.
Broadly speaking, you may cover:

(1) all units in the total population of the study, i.e., take a Census
survey, or

(2) a section of units which would be representative of the 
universe in respect of the characteristics which are being 
studied, i.e., do a sample survey. 


Now, a Census, as it is usually understood, is a ‘count’
of the population of a country, and some characteristics of the
population as on a given date. In research, however, while the
general concept indicated above is retained to some degree, it has
a more restricted meaning, in that a Census survey implies a study
wherein every unit in the universe of study is contacted to gather
the relevant information.


A Census study is resorted to only when all the units in a given 
universe are to be studied. Though it is often presumed that this 
would ensure greater accuracy and reliability of results, this is not 
necessarily so because the total error may be great. The major 
disadvantages of a Census study in relation to other designs are: 


(1) errors in coverage 

(2) it involves great expenditure 

(3) it consumes a great deal of time 

(4) it necessitates an elaborate organisation 

(5) as the personnel required is large, standardisation of data 
collection is difficult because of individual bias 

(6) time lag between collection of data and final reporting is long. 


The above criticism against a Census study holds good only when 
the number of units to be studied is very large. Where, however, 
the number is small and the personnel required is also small, the 
Census study may be used effectively. Moreover, the extent to
which the census study may be utilised would depend on how
exactly the information is to be collected, i.e., by interviewing
every respondent or unit individually or, alternatively, by requesting
every unit to fill in a pre-tested questionnaire. If the latter is
used, naturally a wider coverage of units would be possible as
compared with the earlier procedure.


The sample survey aims to cover a scientifically selected repre- 
sentative cross section of the universe. Thus, the data that are 
collected will be representative of the population from which the 
sample has been selected. 


The main features of the sample survey are: 


(1) it reduces the cost of a study as compared to a Census study. 
The extent to which this reduction can be made will depend 
on the objectives of the study, availability of basic information 
on the characteristics of the universe, the availability of the 
necessary sample frames, the degree of accuracy required, the 
sample size, etc. 

(2) it reduces the time-lag, since the time taken to complete a 
sample survey is less than that for a Census study of the same 
problem 

(3) it increases the quality and accuracy of data. Since a sample 
study will require a comparatively small personnel organisa- 
tion, the quality of the work through closer supervision will 
be high. Further, the amount of bias and non-sampling errors 
can be greatly reduced 

(4) it permits a wider coverage of information. Since fewer per- 
sons are included in a sample survey than in a Census study, 
it may be possible to not only thoroughly study some aspects 
of a problem but it may also be feasible to cover more items 
of information 

(5) it reduces the number of non-responses, since the whole 
population may not be available because of absence during 
the period of study, non-cooperation, etc. 

(6) it will be more reliable and accurate than Census studies 

(7) it provides a basis for estimating the errors arising out of 
sampling procedures.


As against these, the main disadvantages of the sample study are:


(1) It is amenable to error, both due to the sampling process as
well as non-sampling errors like investigator bias, reporting
error, respondent’s error, measurement error, etc.

(2) This is quite a rigid form of study. Hence, a total picture of 
the problem may not be obtained 

(3) High non-responses adversely affect the utility of the study 

(4) The data collected are so voluminous that processing and 
analysis become difficult 

(5) The data are mainly statistically processed, which leaves out 
the qualitative aspect. 


SAMPLE SIZE 


Having decided that you will undertake a sample survey, the 
question that faces you is: what should be the sample size of your 
survey? The size is determined after considering a number of 
factors, many of which have been enumerated above. You must 
remember that the optimum size is that which is: 


(1) representative of the universe 

(2) provides for efficient handling 

(3) contains a degree of flexibility 

(4) has an eye on the reliability of the results 

(5) the degree of precision required 

(6) small enough to reduce expenses to a minimum and avoid 
wastage of funds 

(7) large enough to eliminate, or keep to a minimum, sampling 
errors, and 

(8) the time and personnel available. 


It would also be useful to keep in mind that: 


(1) the more homogeneous the universe, the smaller will be the
sample size.

(2) the larger the number of variables to be studied, the larger
will be the sample size.

(3) the method and technique of data collection also influence
the design, because non-responses may be large.

(4) the size also influences and is in turn influenced largely by 
the sampling procedure or technique. 


But ultimately, if the sample size is not too small, any reasonable 
size should give about the same results as any other size. This is 
well illustrated by the example below. 


I undertook a study of the members of India’s first Lok Sabha to 
find out whether samples of different sizes, each selected inde- 
pendently from the same universe, i.e., members of the Lok Sabha, 
would significantly differ in their results. I first obtained from a
document, Who’s Who, the list of all 496 members of the First
Lok Sabha. This document contained quite a bit of information
on each of the members and, for the purpose of this study, I 
selected the following items of information for analysis: 


1. Past/Present Membership in:
   a. Congress Party
   b. Local Government
   c. Assembly and local levels
   d. Private Organisation

2. Age

3. Lawyer

4. Foreign trips

5. Political prisoner

6. Publications


Coming to the question of sampling, I decided to select four
samples as follows:


1. a 50 per cent sample of 496 = 248 members

2. a 20 per cent sample of 496 = 100 members (99.2 rounded
to 100)

3. a 10 per cent sample of 496 = 50 members (49.6 rounded to
50)

4. a 5 per cent sample of 496 = 25 members (24.8 rounded to
25).


All these four samples were selected by using the same sampling 
procedure viz. systematic sample with a random start. 


The results that were obtained have been presented below. 


DISTRIBUTION OF MEMBERS BY SELECT CHARACTERISTICS
(in percentage)

Characteristic    50%           20%           10%           5%            Z* Score
                  (248)         (100)         (50)          (25)          50% vs 5%

Congress Party    74.3          74            74            72            0.2
Local Govt.       32.7          27            26            24            0.6
Local Assembly    43.5          42            44            48            0.5
Private Orgn.     86.3          86            90            92            0.6
Upto 40 yrs       16.1          15            18            [illegible]   [illegible]
Upto 50 yrs       17.4          20            18            24            0.8
Upto 60 yrs       [illegible]   39            38            36            0.1
Upto 70 yrs       16.1          14            14            20            0.5
Over 70 yrs       -             2             4             4             1.0
No Information    9.3           9             6             12            0.4
Lawyers           31.7          30            34            28            0.4
Foreign visits    21.4          22            16            20            0.2
Pol. Prisoners    35.4          37            30            32            0.3
Publications      17.4          22            20            32            2.0


* The difference will be significant if Z is equal to or greater than 1.65.


The following conclusion may be drawn from the above. Except 
for two items, those upto 40 years of age and those with publi- 
cations, differences between the largest sample (1 in 2) and the 
smallest (1 in 20) are not statistically significant. That is to say 
both these samples may have been selected from the same 
universe.


So, any sample of a reasonable size would yield the same results
as any other sample, subject of course to the various precautions
that have been mentioned earlier.
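

The comparison of two samples by a Z score can be sketched in a few lines of Python. This is only an illustration: the text does not say which form of the test was used for the table, so the pooled standard error below is an assumption, and the function name `two_proportion_z` is a hypothetical one. The figures it gives need not match the table exactly.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Z statistic for the difference between two sample proportions.

    p1 and p2 are proportions between 0 and 1; n1 and n2 are the
    sample sizes. The pooled standard error is used here, which is
    an assumption: the source does not state the exact formula.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return abs(p1 - p2) / standard_error


# Congress Party row: 74.3 per cent of 248 versus 72 per cent of 25.
z = two_proportion_z(0.743, 248, 0.72, 25)

# 1.65 is the cut-off given in the footnote to the table.
print("significant" if z >= 1.65 else "not significant")
```

With a small sample of 25 the standard error is large, which is why only fairly big differences in percentage cross the cut-off.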


But in practice, you will still want to have some rule by which 
you can decide the optimum sample size. In fact, there are two 
different ways in which you can arrive at the appropriate answer. 
Let us label these as the ‘technical or statistical’ method, and the 
‘substantive or analytical’ method of determining sample size. 


The ‘Technical’ Procedure. In this procedure, the sample size is
determined by the following statistical formula:


N = (T * (1 - T)) / (SEp * SEp)


where N is the sample size, 

T is the estimated proportion of a characteristic in the population, 
SEp is the standard error permitted in the sample. 

Let us apply this formula to two examples. 


Example One:
T = .5 and SEp = .05
then N = (.5 * (1 - .5)) / (.05 * .05)
       = (.5 * .5) / (.05 * .05)
       = 100

Example Two:
T = .5 and SEp = .01
then N = (.5 * (1 - .5)) / (.01 * .01)
       = (.5 * .5) / (.01 * .01)
       = 2500

In the first example we get sample size N = 100 and in the second
example the sample size N = 2500. The difference is because of 
the difference in the permitted level of error, viz., 5 per cent versus 
1 per cent.


You will also notice that in applying the formula, T has been
pegged at .5, i.e., the characteristic concerned is assumed to have
two sub-categories, each probably accounting for 50 per cent of the
distribution in the sample. (An example here would be the
percentage of males and of females.) Two questions arise here. How
does one handle a situation when we really do not know what T
will be? Secondly, what if the characteristics selected have more
than two sub-categories? Also, a decision has to be taken as to
how much error in the estimate between the sample and population
results may be accepted.
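

The formula and the two worked examples can be checked with a short Python sketch. The function name `sample_size` is a hypothetical one chosen for illustration, and rounding up to a whole person is an assumption (the two examples in the text come out as whole numbers anyway). The loop at the end shows one common way of handling an unknown T: since T * (1 - T) is largest at T = .5, pegging T at .5 gives the largest, and hence the safest, N for a given SEp.

```python
import math


def sample_size(t, se_p):
    """Sample size N = T * (1 - T) / (SEp * SEp), as a whole number."""
    raw = t * (1 - t) / (se_p * se_p)
    # Round off floating-point noise (e.g. 99.99999999999999) before
    # taking the ceiling, so that the result is not off by one.
    return math.ceil(round(raw, 6))


print(sample_size(0.5, 0.05))  # Example One
print(sample_size(0.5, 0.01))  # Example Two

# The required N for other values of T, at the same permitted error.
for t in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(t, sample_size(t, 0.05))
```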


It is because of the above complications, and more often than not 
the ‘statistics phobia’, that many a student does not like to use 
the technical or statistical procedure for computing the sample 
size. Just in case you are one of those who would like to avoid
statistics till it becomes inevitable, the second procedure, the
substantive one, comes to your rescue.


The Substantive Procedure 


A major feature of this procedure is that it gives some prior 
information about the size of the universe because as the answer 
changes the ‘total number’ would also change. For example, in a 
community of 1000, it is probable that about 400 may be children 
(i.e., less than 15 years of age). Of the remaining 600, ap- 
proximately between 280 and 320 may be women. 


Again these 1000 may be made up of about 150 to 200 families. 
So, while 600 adults may be a large number, 150 to 200 heads 
of families may be a manageable moderate number. Again, if
there is a high degree of homogeneity among these people (it has
to be decided in respect of which characteristics), it
may not be necessary to cover all 150 to 200 families; perhaps
just 50 to 75 may be sufficient. But generally the rule of
thumb that is followed is that if the total is manageable, cover all
of them so that it gives everyone in the community the feeling
that they have not been left out of the project and it facilitates
initial ‘rapport building’.

A question that has yet to be answered is this: How many are to
be interviewed? Earlier it was mentioned, ‘all if not too many’,
and a ‘sample if there are very many’. While this sounds nice
and ‘catchy’ it does not really tell us very much. So let us translate
it into some rule-of-thumb.


To do so, let us go back a little to the objectives of the study, 
for it is this that gives direction to the whole study. Basically, the 
question is what are the differences in answers (e.g., the most 
serious problem in the village — the dependent variable), between 
those with different sub-set of characteristics (the independent 
variable — e.g., those in ‘high’ socio-economic status group and 
those in the ‘low status group’, or the males and the females, or 
the young and the not so young, and so on). This kind of 
comparison requires that ‘theoretically’ we provide for a certain 
number of respondents to be in each sub-set of a characteristic 
(e.g., X number of males and the same number of females, Y 
number of young and the same number of not so young, and so 
on). At the same time, we have also to predetermine the number 
of categories of the dependent variable. Usually, it is optimal to 
have just three categories (e.g., those favourable, those neutral 
and those unfavourable). So again, ‘theoretically speaking’, the 
following would be the typical ‘distribution of respondents’ on 
the two items (independent and dependent variables). Look at the 
following ‘theoretical’ result presented in tabular form.


Age            No problem    Personal problem    Community problem

Young              20               20                   20
Middle Age         20               20                   20
Old                20               20                   20

Total              60               60                   60


16 Survey Research For Social Work 


Thus the theoretical total would be 180 persons to be interviewed
for any meaningful analysis from a statistical viewpoint. One could
however go a step further and, as a ‘safety margin’, add 10
per cent to the above and round it to 200 (180 + 18 = 198,
rounded to 200).


So the rule-of-thumb would be that for most studies this would 
be the approximate size if both the independent and the dependent 
variables have three sub-categories each. But if one or the other 
(i.e., all independent or all dependent variables) are reduced to 
just two categories (e.g., men/women, young/old, and so on) the 
sample size will shrink to just 135 (2 * 3 * 20 = 120 + 12 = 132 
rounded to 135). If both the variables are reduced to just two 
categories each, then the sample is just 90 (2 * 2 * 20 = 80 + 8 
= 88 rounded to 90). But it must be remembered here, that too 
much of artificial ‘data reduction’ will result in loss of quality 
and variations between a number of sub-groups. Economy, on the 
one hand, can lead to distortion on the other. 
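

The arithmetic of this rule-of-thumb can be written out as a small Python sketch. The function name `rule_of_thumb_size` is hypothetical. The 20 respondents per cell and the 10 per cent safety margin come from the text; rounding up to the next multiple of 5 is an assumption inferred from the three worked figures (198 to 200, 132 to 135, 88 to 90).

```python
def rule_of_thumb_size(independent_cats, dependent_cats,
                       per_cell=20, margin_pct=10, round_to=5):
    """Approximate sample size from the number of sub-categories."""
    # One cell for every combination of sub-categories.
    cells = independent_cats * dependent_cats
    base = cells * per_cell
    # Safety margin, rounded up to a whole person (ceiling division).
    margin = -((-base * margin_pct) // 100)
    total = base + margin
    # Round up to the next multiple of `round_to` (assumed convention).
    return -(-total // round_to) * round_to


print(rule_of_thumb_size(3, 3))  # 180 + 18 = 198, rounded to 200
print(rule_of_thumb_size(2, 3))  # 120 + 12 = 132, rounded to 135
print(rule_of_thumb_size(2, 2))  # 80 + 8 = 88, rounded to 90
```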


Coming to the second question, if all the eligible respondents are 
to be covered the question of ‘how to select them’ does not really 
arise. If however, it is decided to take only a part of the total, 
then it would be necessary to follow some predetermined proce- 
dure. This is discussed later. 


SAMPLE SELECTION 


SOME CONCEPTS IN SAMPLING 


The most important consideration in sampling is that it must have
an element of randomness. By randomness is implied that you
do not introduce any bias in the selection of a particular unit. In
other words, under normal conditions, each and every unit in the 
universe must have an equal chance of being selected to represent 
the universe. 


Thus, a sample of units from a universe will be scientific only
when it is an unbiased selection and is representative of the
universe. As we have already seen earlier, the universe is the
totality of units from which the sample is selected or derived. A 
sample is said to be representative when its characteristics are 
typical or near typical of the universe from which it is selected. 
That is to say, the values of characteristics obtained for the sample 
would be identical or near identical to the universe from which 
the sample is selected. Further, a representative sample is one 
which will enable you to make a generalisation about certain 
characteristics of the universe. 


SAMPLE FRAME 


We have just seen that in a sample survey, every unit in the 
universe must have equal chances of being included in the sample. 
To ensure this, it is necessary to have a complete list of every 
unit in the universe. This list is usually known as the sample 
frame and is a complete listing of all eligible units —be it 
individuals, families or any other unit of study. Such lists contain 
information about each unit that will enable you to identify each 
and every unit. It would, invariably, contain the name and address. 
For example, the attendance register of an Adult Education 
Programme (AEP) will give you the relevant details of the adults 
who attend the programme, and even of those who have dropped 
out of the programme. 


A sample frame is considered to be complete or perfect if no unit 
has been omitted from the list. Unfortunately, it is not always that 
one comes across such a list, because it may suffer from such 
defects as incompleteness, inaccurate information and obsolete 
data. But if there is some way of finding out the degree of 
incompleteness, necessary efforts could be made to correct the 
frame and appropriate sampling adopted. In practice, when a 
complete frame is not available, or even the existing ones are 
known to be defective, it is checked and corrected as far as 
possible or a new frame prepared. Before discussing this further, 
it may be worthwhile indicating, in the first instance, some of the
major sample frames that are used and available in India for the
purpose of undertaking surveys.


The two most commonly used sampling frames that can be used 
in both urban and rural areas are the Population Census List and 
the Electoral Rolls. 


Population Census List 


The most common list that is available to any researcher is the 
Population Census Enumeration List which may be detailed or 
abridged. Among other data, the census household list contains 
information on the locality, Census household number, name of 
the head of the household, the members in the household, sex, 
age, etc. Out of the detailed household list, an abridged list is 
prepared containing information about the sub-number of each 
census house and the name of the head of the household. 


On going through these lists, one will find that vacant buildings, 
shops, hotels, restaurants and other structures are also enumerated. 
Hence, these have to be deleted in preparing the final frame for 
sampling. 


Some of the defects that may creep into such a list are: 

1. The list may have omitted, due to oversight, some of the 
households 

2. Sometimes when two or more households are living in the 
same room, they may be enumerated as only one household 

3. There may be duplication of individuals or households, par- 
ticularly when the enumeration is carried out over a long 
period of time and the particular person or household has 
shifted to another locality in the same town or city 

4. Households or individuals moving into the locality after the 
enumeration may not have been enumerated 

5. The list may include households which have moved out of
the jurisdiction since the enumeration was completed

6. Some of the households enumerated may have shifted and
been replaced by another household

7. Some structures may have been pulled down for road
construction, or the numbering of houses may be defective.


In view of the above, it is generally observed that while the Census
Enumeration List may be on the whole quite reliable as a sample
frame for the first few years after the Census year, its utility as
a frame declines with time. However, it has to be pointed out at
this stage that this is more true of urban areas than of rural
areas. Further, although the frame may not be a complete one, for
practical purposes the necessary reliability checks may be carried
out and corrections made before it is utilised.


Electoral Rolls 


The Electoral Rolls are prepared just before an election — nation- 
al, state, or municipal. They contain information regarding each 
eligible voter in terms of the location of his or her household, the 
name of the voter, sex and age. This list may be revised just 
before a particular election. Hence, it contains a section pertaining 
to additions and deletions of voters. The shortcomings discussed 
in regard to the Census Enumeration List are also present in the 
Electoral Rolls. 


Also, since the Rolls are prepared for purposes of elections, these 
contain the names of adults only and that too of those residing in 
the area for a minimum period of time. Furthermore, the Rolls 
may contain the names of persons who are not normally residents 
in that town, but who by virtue of long residence in earlier periods 
have continued having their names on the Rolls. Similarly, some
of those residing there may have their names registered in other
towns. But here too, as in the case of the Census Rolls, the
Electoral Rolls have to be first checked and corrected before being
used for sampling.


It is not always possible to undertake a complete check and
correction of an Electoral Roll for a town. Considerations of time,
financial resources, purpose of study, etc., may not enable a
researcher to go through this tedious procedure.


Attendance Register 


When studies are to be carried out in an organisation like a factory, 
involving those who are employed in that organisation, the best 
frame is the Attendance Register or Muster Roll. Since it is utilised 
in the day-to-day working of the organisation it is expected that 
this Register will always be maintained up to date and contain 
considerable information about each person. However, this is not
always so.


SAMPLING PROCEDURES 


The sampling in any study will depend in the first place on the
availability of some of the basic characteristics of the universe to
be studied. Some of the other factors influencing the procedure
are the same as those pertaining to the sample size, viz., time,
resources, accuracy, etc.


The major types of sampling that are usually utilised in sample 
surveys are: 
1. Probability Sampling 
(a) simple random sampling 
(b) systematic sampling
(c) stratified sampling 
2. Judgement Sampling 


PROBABILITY SAMPLING 


The different sampling procedures which are included in this type
of sample design have in common one characteristic, viz., the units
are selected automatically. A second common feature is that the
data from units selected by this type of sampling can be
statistically treated for bias and error, i.e., sampling tolerance can be
estimated and controlled. We shall now describe each of the
procedures.


Simple Random Sampling 


The easiest way of selecting at random a representative number
of units from a universe is by the use of simple random sampling.
The procedure may be any one of the following: 


(i) Lottery 

(ii) Pinprick 

(iii) Random Numbers


The common feature of all these is that they follow an absolutely
random procedure. The selection is based on scientific and rational
principles. The first one is often used in prize competitions where
the ‘lucky number’ determines the ‘lucky person’.


Though the lottery procedure of drawing numbers is a similar
one, yet it may be worthwhile to briefly indicate the procedure
here. Let us presume that there are 50 persons from which 10 are
to be drawn. Each of the numbers 1-50 is written on 50 different
slips of paper, each of equal size, which are then folded symmetrically
and identically and put into a box. These are now thoroughly
mixed. Then one person picks up any one folded slip at random.
The slip is picked and the number recorded. The same
procedure is followed till the 10 numbers are picked out. It hardly
needs to be pointed out that as each slip is picked and the number
recorded, it is discarded and not returned to the box.
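

For those who have a computer at hand, the lottery can be imitated in Python. This is a sketch only: the function name `lottery_draw` is hypothetical, and the computer's pseudo-random generator stands in for the box of folded slips.

```python
import random


def lottery_draw(universe_size, sample_size, rng=None):
    """Draw `sample_size` numbered slips from a box, without replacement."""
    rng = rng or random.Random()
    # One slip for each person, numbered 1 to universe_size.
    slips = list(range(1, universe_size + 1))
    # Mix the slips thoroughly.
    rng.shuffle(slips)
    drawn = []
    while len(drawn) < sample_size:
        # A slip that has been drawn is not returned to the box.
        drawn.append(slips.pop())
    return drawn


print(lottery_draw(50, 10))
```

Each run gives a different set of 10 numbers; passing `random.Random(1)` (or any fixed seed) as `rng` makes the draw repeatable, which is useful when the selection has to be documented.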


The pinprick procedure is often used when one has to choose 
from a number of alternatives which are initially placed in a 
circular form on one sheet of paper. Then one is blindfolded or 
with closed eyes is required to prick at 10 different places in the 
paper. These form the sample. When the same number is pricked
twice, the second is not taken into consideration.


The random number procedure is used by reading out the numbers
from a ‘Table of Random Numbers’. This overcomes the
shortcomings of the above two procedures. Most statistical books
contain this table. The procedure is as follows:


Suppose, again there are 50 persons from whom the 10 are to be 
selected. Every person here is assigned a number of two digits 
i.e., 01, 02, 03,......50. Then, the Table of Random Numbers is 
referred to by selecting any page but in that page two columns 
at a time should be read off in proper sequence. Let us say the 
Table is as follows: 


24 36 

37 32 

59 48 

04 29 

99 13 etc. 


Reading from the first two columns downwards we get, 24, 37, 
59, 04, 99, and so on. As each number which falls between 01 
and 50 is read out, it is recorded. However, if the same number 
is repeated, it is ignored and the next number is read. Again, any 
number above 50 is also ignored. In the above example, 59, 99, 
are ignored. This procedure is followed till the requisite 10 
numbers are obtained. 
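

The reading rules described above (skip numbers above 50, skip repeats, stop when 10 have been obtained) can be sketched in Python as follows. The function name `select_from_table` is hypothetical, and the list passed to it is just the five numbers shown in the illustrative table, so it yields fewer than 10 selections.

```python
def select_from_table(table_numbers, universe_size, sample_size):
    """Pick units by reading numbers off a table of random numbers."""
    selected = []
    for number in table_numbers:
        if len(selected) == sample_size:
            break  # the requisite number has been obtained
        if number < 1 or number > universe_size:
            continue  # outside 01 to universe_size: ignore
        if number in selected:
            continue  # repeated number: ignore
        selected.append(number)
    return selected


# First two columns of the illustrative table, read downwards.
first_two_columns = [24, 37, 59, 4, 99]
print(select_from_table(first_two_columns, 50, 10))  # [24, 37, 4]
```

In practice one would keep reading further down the table (and on to the next pair of columns) until all 10 numbers are found.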


The greatest drawback of these procedures is that these are 
unwieldy when a large universe is involved, and where every item 
of the universe has not been accounted for i.e., the frame is 
incomplete. Thus, on the whole, it is a very tedious procedure. 


Systematic Sampling 


In the case of systematic sampling we first serially number the 
population as in the above example from 1 to 50. Let us again 
decide to take a sample of size 10. Now divide the population of 
50 into a number of groups where this number is equal to the 
sample size, i.e., 10 groups of 5 each. The groups would be 1-5,
6-10, 11-15, 16-20, 21-25, etc., to 46-50. If the population size is
exactly divisible by the sample size, then the number of individuals
in each group will be equal to this quotient as seen above.
If the population size is not exactly divisible by the sample size, then
the number of individuals in each group will be equal to the
integral part of the quotient. Since the group sizes are equal,
we may number the individuals in each group separately. Then
from the first group, one unit is selected at random (by the simple
random sampling discussed earlier), say 3, and then from the other
groups one unit each is selected in such a way that their relative
positions within the groups are the same as the position of the
unit selected from the first group, i.e., in the second group, 8 is
selected, then 13, and so on, till 48. While the first number
selected, i.e., 3, has a randomness about it, in the selection of
the remaining units some systematic element is involved. What has
actually been done is to add 5 to 3 to get 8, then 8 + 5 to get 13,
etc. This value which is added is called the sampling interval.


Let us take another example. 


Example — 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.............. 105. 


Let us say it is desired to select 10 units out of this total 105 
units. The first thing to be determined is the sampling ratio: 


R = N/n = 105/10 = 10.5 


Thus, the interval is 10, the largest integer (whole number)
contained in R. Hence, out of the first 10 numbers, the first
sampling unit is selected using simple random sampling (say
7). Then, to this number is added 10 (i.e., the sampling interval) and so
the next unit to be in the sample is 17, the third 27, and so on
up to 97, thereby yielding 10 units which will form the sample
for the study.
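

Both examples of systematic sampling can be reproduced with a short Python sketch. The function name `systematic_sample` is hypothetical; the interval is taken as the integral part of N/n, as described above, and the random start is drawn from the first interval when none is supplied.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Systematic sample with a random start from units numbered 1..N."""
    # Sampling interval: the largest whole number contained in N / n.
    interval = population_size // sample_size
    if start is None:
        rng = rng or random.Random()
        # Random start chosen from the first group of units.
        start = rng.randint(1, interval)
    # Keep adding the interval to the start to get the remaining units.
    return [start + i * interval for i in range(sample_size)]


print(systematic_sample(50, 10, start=3))   # 3, 8, 13, ... 48
print(systematic_sample(105, 10, start=7))  # 7, 17, 27, ... 97
```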


It will thus be seen that this method affords a quicker and easier
procedure in drawing the sample. This is of additional advantage
when the drawing is to be done by a person other than the
statistician herself.




Further, this ensures a spread of the sample over the universe 
unlike the earlier ones where there is a chance of numbers which 
are close together being picked up. Moreover, it ensures greater 
representation. 


Stratified Sampling 


Stratification means that the population to be studied is divided 
into a number of homogeneous strata or groups before any sam- 
pling is done. This is particularly useful when the population is 
known to be heterogeneous. 


The strata may be formed either on a geographical basis, where 
a region is divided into a number of areas or on the basis of 
homogeneity of the various groups comprising the population. 
Once the stratification is done, the procedure used may be any
one of the above.


Stratified sampling is resorted to: 

(a) when there is heterogeneity of groups in the population 

(b) only if these characteristics which are heterogeneous have any 
relevance to the study 

(c) if data of known precision are wanted for the strata of the 
population and 

(d) when sampling problems differ markedly in different parts of the
population.

It need hardly be pointed out that stratification
may increase the cost of study.


Example: In a community study, the stratified sampling procedure 
may be adopted. On the basis of an enumeration of the locality, 
a list is prepared of all the heads of household living there. Then
out of this, two supplementary lists are prepared according to
whether a person is a ‘landlord’ or a ‘tenant’, i.e., they are
stratified according to the above characteristics. Within each
independent list the actual interviewees are selected by resorting
to ‘systematic sampling with a random start’. This enables one
to make independent estimates for each stratum and, by combining
the results of the two strata, to generalise for the whole population.
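

The landlord and tenant example may be sketched in Python as below. Everything specific in it is hypothetical and for illustration only: the list of 100 heads of household, the household codes, the sample sizes of 5 and 15, and the function names `systematic_pick` and `stratified_sample`.

```python
import random


def systematic_pick(units, sample_size, rng):
    """Systematic sample with a random start from one stratum's list."""
    interval = len(units) // sample_size
    start = rng.randint(0, interval - 1)  # position within the first group
    return [units[start + i * interval] for i in range(sample_size)]


def stratified_sample(heads, stratum_of, sizes, rng=None):
    """Split the list into strata, then sample each stratum separately."""
    rng = rng or random.Random()
    strata = {}
    for head in heads:
        strata.setdefault(stratum_of(head), []).append(head)
    return {name: systematic_pick(strata[name], size, rng)
            for name, size in sizes.items()}


# Hypothetical enumeration: 100 heads of household, every fourth a landlord.
heads = [("H%03d" % i, "landlord" if i % 4 == 0 else "tenant")
         for i in range(1, 101)]

sample = stratified_sample(heads, lambda head: head[1],
                           {"landlord": 5, "tenant": 15})
for stratum, chosen in sample.items():
    print(stratum, [code for code, _ in chosen])
```

Because each stratum is sampled from its own list, estimates can be made for landlords and tenants separately before the two are combined.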


Some of the characteristics by which stratification may be done 
are: 

(a) race 

(b) religion 

(c) education 

(d) socio-economic status 

(e) age 

(f) geographical region, etc 


These are logical bases for stratification. 


JUDGEMENT SAMPLING 


There are also a number of alternate procedures which are used
especially when the basic information needed to implement probability
sampling is not available. Whatever procedure we select should not
be guided by deliberate bias in selecting the people to be
interviewed. Those selected should collectively reflect the community
in its views. To ensure a balanced distribution of the sample (i.e.,
we should spread the sample over the whole population) the
quickest, most pragmatic approach in rural areas would be the
following.


First (in your mind or on a rough map of the village), divide the
village into four to eight more or less equal parts. The ‘equal’
refers not to the physical land but the distribution of the population
so that each part more or less has the same number of people.
The physical land distribution is not meaningful because one or 
two of the parts may be barren or occupied by grazing grounds, 
or cultivable land, or the village pond, etc. And the aim is to 
sample people not physical space as such. 


In case you find it difficult to adopt the above, then walk through 
the village and identify the different paths, lanes or roads, etc.; that 
is, adopt any general criterion on which the village can be divided 
into components with the distribution of people more or less in 
equal numbers. 


If even after the second option it is seen that the number, say, in 
the different streets or lanes is not the same, then roughly estimate 
the number on each street, separately for the left of the path and 
its right. If even this fails, as you walk around the village estimate 
the approximate length of the paths. This is your ‘total unit of 
length’. Now divide this by the sample size and you will get 
‘approximate distance between two respondents’. So at this regular 
interval, select your respondents as you walk around the village. 
To be doubly sure, take the first unit from one side of the path 
and the next from the other side so that even here there is an 
unbiased distribution around the village. 
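The last fallback, walking the paths and stopping at a regular distance, amounts to a simple division. The sketch below assumes a hypothetical village with 3000 metres of paths and a sample of 60. Placing the first stop one full interval from the starting point is an assumption of the sketch, not something the procedure prescribes.

```python
def walking_plan(total_path_length_m, sample_size):
    """Return the spacing between respondents and a list of (distance, side) stops."""
    spacing = total_path_length_m / sample_size
    plan = []
    for i in range(sample_size):
        # Alternate sides of the path so that both sides are covered.
        side = "left" if i % 2 == 0 else "right"
        plan.append((round((i + 1) * spacing, 1), side))
    return spacing, plan


# Hypothetical figures: 3000 m of paths, 60 respondents wanted.
spacing, plan = walking_plan(3000, 60)
print(spacing)       # distance between two respondents, in metres
print(plan[:4])      # first four stops
```

Here the ‘approximate distance between two respondents’ works out to 50 metres, with the stops falling alternately on the left and the right of the path.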


Now we are ready to apply the above information to our four 
examples. 


FIRST EXAMPLE: 
WOMEN AND EMPLOYMENT 


Given the above alternatives, one can adopt more than one pro- 
cedure to select the representative sample. However the 
rule-of-thumb would be to first decide the sample size. The larger 
the sample, the more complex the selection procedure is likely to be. 


Now it is not always possible, or necessary to adopt a rigorous 
procedure for determining the sample size. Personally, very often 
I use a very practical rule-of-thumb and it has served the projects 
and the sponsors quite well. The objectives show that we have to 
compare the sub-groups of respondents in respect of their views 
on whether or not women should be employed. So let us take the 
first part of the statement, i.e., the sub-groups. I would assume 
that most often the most pragmatic results are obtained when you 
have just 2 or 3 sub-groups. Invariably, sex has only 2 sub-groups 
— men and women. Age could have 3 to 5 sub-groups. In the 
former case you have the young, the middle aged and the old. In 
the latter you have the youths, the young adults, the middle aged, 
old adults and seniors. But effectively, one could not envisage 5 
sub-categories. So I would recommend just 3. 


Now coming to the dependent variable. Here too you have just 2 
sub-groups. Those who approve of women’s employment and 
those who do not approve of it. But you could argue that there 
may be a sizeable number who cannot make up their mind. So 
they will either say ‘don’t know’ or they will say ‘depends’, 
meaning that the answer would depend on certain circumstances. 
So you are left with 3 sub-groups here as well. 


In sum then we have a total of 9 independent sub-categories (3 
of the independent variable by 3 of the dependent variable). But 
before that just one more item has to be added. How many are 
required, theoretically speaking, for a meaningful analysis? Quite 
a few statisticians say it has to be 20 for stable results. Some 
recommend the number 10. There are, however, statistical tests 
that point out that corrections in the results must be made when 
the cell value is less than 5. Now in practice, unless we consciously 
select our respondents, it would be extremely difficult to get that 
‘magic number’, be it 20 or 10 or even 5, in each cell. But what if 
a cell does not have 5 or 10 or 20 or whatever number you 
stipulate? In fact, in a really good random sample (good in that 
the selection is done correctly) what you get is what is in the 
population. 


So keeping in mind the above, I would recommend that you take 
a sample of 20 * 9 = 180 + 10% of 180 (rounded off) = 200 as 
the minimum sample size for moderate to large populations. If, 
however, the population is large and heterogeneous, then you may 
even want to do a complex analysis by taking into consideration 
not just one, but two independent variables or even more than 
two. In that case, multiply the basic 9 by 3 for each independent 
variable that you add and take the 10% of the result and round 
it off. So if you have a total of 4 variables (3 independent and 1 
dependent), you have a total of 3 * 3 * 3 * 3 * 20 = 1620 + 10% 
= 1782 (rounded to) 1800 or say 2000. 


The more complex the proposed analysis is going to be, the larger 
should be the number provided for in each cell. 


Let us assume for this study that we are not planning a more 
complex analysis than a 3 * 3 * 3 matrix. So we have 27 cells * 
20 per cell (hopefully) = 540, which gives a rounded off figure of 600. That 
will be our sample size. 
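The rule-of-thumb used above (number of cells, times 20 per cell, plus 10 per cent, rounded off) can be written down as a small function so that you can try other combinations of variables. The function name and the choice to round up to the next hundred are assumptions made for this sketch. The text itself only says ‘rounded off’.

```python
def minimum_sample_size(categories_per_variable, per_cell=20, allowance_pct=10):
    """Cells x respondents per cell, plus an allowance, rounded up to the next 100."""
    def ceil_div(a, b):
        return -(-a // b)                 # integer division, rounding upwards

    cells = 1
    for k in categories_per_variable:
        cells *= k                        # e.g. 3 age groups x 3 responses = 9 cells
    base = cells * per_cell
    with_allowance = base + ceil_div(base * allowance_pct, 100)
    rounded = ceil_div(with_allowance, 100) * 100
    return cells, with_allowance, rounded


print(minimum_sample_size([3, 3]))        # one independent, one dependent variable
print(minimum_sample_size([3, 3, 3]))     # the 3 * 3 * 3 matrix chosen for this study
print(minimum_sample_size([3, 3, 3, 3]))  # three independent variables
```

The three calls reproduce the figures in the text: 198 rounded to 200, 594 rounded to 600, and 1782 rounded to 1800.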


The next step is to decide how we shall select the respondents in 
such a way that they would collectively reflect the universe from 
which they are selected. There are as you have already seen, so 
many different ways of doing it. Yet, the sample is relatively 
small compared to the total population say of Bombay. 


First, a sample frame. You know many are available, but the most 
recent one and probably the least distorted would be the Election 
Rolls. But someone may argue that even here quite a few people 
have been left out. In fact, some would even argue that quite a 
few pockets have been left out. Assuming these complaints to be 
true (but not necessarily accepting the arguments), we can use the 
electoral system and yet bypass the problem. For this, you must 
know a little about how the system works (for voting purposes 
only). 


First and foremost, the metropolis is divided into a large number 
of constituencies. Within each are a number of polling booths. 
Each booth, let us say, accounts for a more or less fixed number 
of voters. So you can use both these frames, i.e., the constituencies 
list followed by the polling booth list. 


Now take a simple random sample of constituencies. Assuming 
that there are 12 constituencies and you take a 25 per cent sample, 
you get three sample constituencies. Within each sample 
constituency, take a simple random sample of polling booths. Now in 
each constituency, there will be say 100 polling booths. So take 
10 per cent of these. You have 10 polling booths. Now to equitably 
distribute the sample of 600, decide to take 60 adults from each 
sample booth segment. 


So we come to the final stage. You have a number of choices. If 
you think that the Electoral Rolls are defective, there is no point 
using these as the third level sample frame. Instead, look through 
the sample polls descriptions and you will find that they 
enumerate, for each poll area, the physical coverage in terms of 
roads and buildings. So you can take one of two paths. First, 
consider taking a simple random sample of four roads and within 
each interview 15 persons. Now of course you have entered the 
realm of purposive sampling. However, you could physically 
distribute the sample of 15 so that these cover the whole road, 5 
from the top of the road, 5 from the middle and 5 from the end 
of the road. And then each of these 5 can be distributed on both 
sides of the road. And within the house you select randomly one 
adult. 


Assuming that the above sounds too arbitrary you could make a 
quick check to ascertain whether any building has been left out 
of the Roll listing, and then after adding it to the frame, take a 
sample of buildings in the sampled booth areas, and within that 
take a sample of adults. 


SECOND EXAMPLE: 
HOUSING SITUATION 


First we tackle the question of type of survey. Since we know for 
a fact that a metropolis by definition is a very large agglomerate, 
we have to settle for a sample survey of households. 


The next question is: What should be the sample size, for this 
will determine how the sample is to be selected. The larger the 
size, the more elaborate the procedure. Now using the 
rule-of-thumb mentioned earlier (you will recall that it can be determined 
on the basis of the key variables and their expected responses), 
we have the following variables identified as the ones which are 
most influential in determining housing preference: 


• socio-economic status (three groups) 

• present type of dwelling (four groups) 

• preferred type of dwelling (four types) 

3 * 4 * 4 = 48 cells; 48 * 20 = 960; 960 + 20% (of 960) = 1152, 
say 1200 


You will notice that we have not taken into account here the 
actual size of the universe, which will indeed be very large (in 
lakhs). 


This takes us to the third question: how do we select this number 
of households, for the study is of households as represented by 
their heads. 


Just for the sake of argument let us assume that the universe 
consists of 12,00,000 households. We are intending to cover just 
0.1 per cent of the households, almost like a few needles in a 
haystack. One way to overcome this is to reduce the bulk of 12 
lakhs into smaller ‘portions’ so that we can use a reasonable 
procedure to select our sample. To do this we can resort to a 
multi-stage sampling procedure because any city is divided for 
administrative reasons (municipal or law and order or census 
operations or elections and so on) into convenient administrative 
levels. Thus, the city will be divided into wards, each ward into 
circles, and each circle into blocks. Alternately, we have the 
parliamentary seat constituency, with its state assembly constit- 
uencies, with a further break-up into election polls, and finally 
the polling booths. Similarly, the Census operations have a pro- 
cedure for dividing the city into smaller geographical segments 
for work. 


For the study at hand, we can adopt a two-stage procedure so 
that the whole task does not get too complicated. First, we take 
the Electoral Rolls. We start with the list of electoral constituencies 
in the city (we assume this number to be 24), which have among 
them a total of 2220 election polls. From this 2220 we decide to 
take a 5 per cent simple random sample of polls, i.e., about 110, by 
making use of the Table of Random Numbers (TRN). Given the 
norm of 1000 voters per poll, and assuming that three persons 
from each household vote, the number of households covered by 
each poll would be about 333. So the 110 
polls will collectively account for 110 * 333 = 36,630 households. 
We need only 1200, or approximately 1 in 30 households, or 11 
to 12 households per poll. 


Armed with this number, we now get hold of the sample poll lists 
and through simple random sampling (using again the TRN), select 
12 respondents from each list and record their names and addresses 
for interviewing. Since we will in actuality have 110 * 12 or 1320, 
there is a slight excess of 120, but this will get adjusted against 
non-responses of a few respondents. 
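If a computer is at hand, the two stages can be sketched as follows. Please note the substitution: Python's random number generator is used here in place of the Table of Random Numbers. The serial numbering of polls and households is also a simplifying assumption, since in practice you would draw from the actual poll lists.

```python
import random


def two_stage_sample(n_polls, n_sample_polls, households_per_poll, per_poll, rng):
    """Stage 1: simple random sample of polls. Stage 2: households within each poll."""
    polls = sorted(rng.sample(range(1, n_polls + 1), n_sample_polls))
    return {
        poll: sorted(rng.sample(range(1, households_per_poll + 1), per_poll))
        for poll in polls
    }


# Figures from the example: 2220 polls, 110 sampled, about 333 households
# per poll, 12 households taken from each sampled poll.
selection = two_stage_sample(2220, 110, 333, 12, random.Random(2024))
total = sum(len(households) for households in selection.values())
print(len(selection), total)
```

The seed passed to `random.Random` is arbitrary. Recording it lets someone else reproduce exactly the same selection later.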


The whole procedure seems complex but in doing it, you will find 
the actual work is not all that difficult. 


THIRD EXAMPLE: 
MEASURING CONSCIOUSNESS 


Keeping in view that the source of data for this study is the 
learner-participants in the different adult education classes, the 
first question is this: What is the size of the universe? This 
question contains within it two sub-questions. The first is: What 
is the number of adult education centres? The second sub-question 
is: What is the number of learners in each Adult Education 
Centre (AEC)? 


To find the answer to the question, we can refer to the official 
records and at the same time get an idea of the geographical 
distribution of the AECs. The records reveal that there are a total 
of 193 adult education centres distributed all over the State. The 
records also reveal that the number of learners was 5790 or an 
average of 30 per centre. 


The second major question now is: How many are to be selected 
and how shall we select the required number? Given the fact that 
the adult education centres are distributed all over the State we 
have two choices. First we can select a sample of districts from 
the total of districts in the State. But the question is this: What 
is the relevance of the district as a basis for selection? Well, if 
we have no particular argument to show that we could anticipate 
inter-district differences, there is no reason to ‘complicate’ our 
sample design. 


In any case we can argue that whatever the procedure, if we adopt 
a probability sample, we will get a representative sample of 
respondents. So we can settle for a second way of sampling. The 
decision is this: we will select 25 per cent of the AECs through 
a simple random sampling procedure using a Table of Random 
Numbers. So we get in the first stage a total of 48 AECs. Having 
completed our first stage sampling, the second stage is less 
difficult. We know that these 48 AECs will account for about 
1440 learners (48 * 30). This is not a very large number, but, 
hypothetically keeping in mind our limited resources including time, 
money and manpower and yet ensuring a representative sample, we decide 
to select one-half of the active learners in each AEC. Why the 
reference to ‘active learners’? This is because we are likely to 
find that a number of learners on the register are ‘frequent’ 
absentees, and that some have dropped out of the programme. 


Assuming that they constitute about 15 per cent of the total (say 
4 per centre), we have an average of 26 active learners. We then 
select 13 from each centre. (It is more important to emphasize 50 
per cent rather than 13, because in some centres there may be 
more than 26 active learners and in others fewer than 26. We have 
to select 50 per cent of the actual active learners.) 
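A brief sketch of this design follows. The function names are hypothetical, a software random number generator again stands in for the Table of Random Numbers, and rounding down when a centre has an odd number of active learners is an assumption of the sketch.

```python
import random


def select_centres(n_centres, fraction_pct, rng):
    """First stage: a simple random sample of centres, by serial number."""
    n = n_centres * fraction_pct // 100          # 25 per cent of 193 gives 48
    return sorted(rng.sample(range(1, n_centres + 1), n))


def select_learners(active_learners, rng):
    """Second stage: one-half of the learners actually active in the centre."""
    return rng.sample(active_learners, len(active_learners) // 2)


rng = random.Random(7)
centres = select_centres(193, 25, rng)

# Hypothetical centre with 26 active learners on its register.
active = ["learner %02d" % i for i in range(1, 27)]
chosen = select_learners(active, rng)
print(len(centres), len(chosen))
```

Because the second stage works on whatever list of active learners a centre actually has, a centre with 30 active learners contributes 15 and one with 20 contributes 10, which is the point of emphasising 50 per cent rather than 13.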


FOURTH EXAMPLE: 
DCPs AND HUMAN DEVELOPMENT 


Let us assume here that a national organisation has decided to do 
this study among project directors with whom it interacts. The 
organisation found from its records that it had just over 1000 
‘active’ contact directors on its records. In view of the fact that 
these were spread all over India and taking into account that it 
had a complete sample frame, the decision taken was to select its 
sample through systematic sampling with a random start. Since 
its projects could be categorised into five groups of activities and 
the directors themselves could be anticipated to fall into one of 
three levels of understanding human development, the expected 
sample size is 5 * 3 * 20 or 300. In order to provide for 
replacements in case some project directors from the list were 
unavailable for any reason whatsoever, another 10 per cent was 
added to yield a total of 330. So one in every three is selected, 
the first being one of the first three followed by every third. If 
the first to be selected is number 2, then the subsequent respon- 
dents would be 5, 8, 11, 14, 17, 20 and so on. 
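The selection just described is easy to write down. The sketch assumes, for illustration only, that the frame has exactly 1000 names numbered serially. The organisation's actual list is described only as ‘just over 1000’.

```python
import random


def systematic_positions(frame_size, interval, start):
    """Serial numbers selected: `start`, then every `interval`-th name after it."""
    if not 1 <= start <= interval:
        raise ValueError("start must lie within the first interval")
    return list(range(start, frame_size + 1, interval))


def random_start(interval, rng):
    """Pick the first unit at random from the first `interval` names."""
    return rng.randint(1, interval)


chosen = systematic_positions(1000, 3, 2)
print(chosen[:7])    # [2, 5, 8, 11, 14, 17, 20]
print(len(chosen))   # 333
```

With a frame of 1000 and an interval of three, the procedure yields 333 or 334 names depending on the start, which is close to the 330 aimed at.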


CHAPTER 5 


ANALYSIS 
DESIGN 


Analysis design is very often ignored when a research study is 
being planned. As a result, you are more often than not, likely to 
collect information that you are unable to ‘fit’ into the ‘canvas’ 
of your study. You also do not do justice to the data that you 
have so painstakingly collected. 


What is this Analysis Design and what is its function? The analysis 
design is concerned with planning the most optimal manner in 
which the voluminous data collected, can be summarised and 
analysed, to arrive at the answers to the questions that have been 
raised in the study and represented by the objectives. 


The focus of the analysis design in sample surveys is essentially 
in preparing for the statistical analysis of quantitative or quantified 
qualitative data rather than the analysis of qualitative information. 
So one must be familiar with, and have some basic skills at least 
in statistics. This will be taken up only later in this Primer. For 
the present it would suffice to be aware of the essential logic of 
analysis design. As you gain more knowledge about data process- 
ing and analysis, you will hopefully increasingly appreciate the 
importance of the analysis design in planning a research study. 
For the purpose of demonstrating how the analysis design is 
carried out, let us take the four examples again. 


FIRST EXAMPLE: 
WOMEN AND EMPLOYMENT 


You will recall that this study has just two objectives. The first 
is to find out how many respondents are in favour of women 
taking up employment. To answer this we have the following 
question put to the sample of 600 respondents: ‘In your opinion, 
should women take up employment?’ We anticipate one of the 
following answers: Yes/No/Depends, and Don’t know/Can’t 
say/No opinion. 


Therefore, what is the analysis that we will adopt to fulfil this 
objective? To begin with, we will COUNT: 


How many said YES (in favour) 

How many said NO (against) 

How many said DEPENDS (conditional yes/no) 

How many said DON’T KNOW/CAN’T SAY/NO OPINION 


So all the different answers will be summarised into four groups 
as follows: 


Response                    Number    Per cent 


Favour 

Depends 

Do not favour 

Don’t know/No response 

Total Answers 


You can see that the 600 respondents have been reduced to four 
sub-groups. Then using appropriate statistical procedures, we can 
estimate from these results the distribution for the population from 
which the sample was drawn. 


The second objective calls for comparing the views of sub- 
categories of respondents. In other words, we have to compare 
the views of men and women. Similarly, the young, the middle, 
and the old age groups have to be compared. And so on. How 
do we prepare for this? 


So far as sex is concerned, there is no problem. We have the men 
and the women. We arrange to prepare two sets of results, one 
for the men and one for the women. Then we present the results 
of this exercise in such a manner that the sub-categories of sex 
can be compared. The following is a typical format for this 
purpose. 


Sex            Favour   Depends   Do not Favour   Don’t Know/   Total 
                                                  No response   (100%) 

Males:    N 
          % 
Females:  N 
          % 


You can now compare how many men and how many women 
favour, how many men and how many women don’t favour, and 
so on. As the number of men who have been interviewed may be 
more than the number of women, the number of men in a cell 
may not be strictly comparable to the number of women in a 
corresponding cell. So the differences between cells due to this 
numerical difference must be neutralised. We decide to do this 
by computing the percentages for the two sub-categories. That is, 
the number of men in each cell is converted to a percentage of 
all men, and in the same way for women. Now the cell values become 
comparable. So the first statistic that you use is the ‘percentage’. 
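To see how the percentages neutralise the difference in numbers, consider the sketch below. The counts are invented purely for illustration (360 men and 240 women, making up the sample of 600). They are not results from any study.

```python
def row_percentages(counts):
    """Convert each cell to a percentage of its own row (sub-category) total."""
    table = {}
    for group, row in counts.items():
        total = sum(row.values())
        table[group] = {response: round(100 * n / total, 1)
                        for response, n in row.items()}
    return table


# Hypothetical counts only.
counts = {
    "Males":   {"Favour": 180, "Depends": 60, "Do not favour": 90, "Don't know": 30},
    "Females": {"Favour": 144, "Depends": 36, "Do not favour": 48, "Don't know": 12},
}
percentages = row_percentages(counts)
for group, row in percentages.items():
    print(group, row)
```

In these invented figures more men (180) than women (144) favour employment in absolute numbers, yet as a percentage of their own group it is the women who are more favourable (60 per cent against 50 per cent). This is exactly why the raw cell counts should not be compared directly.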


But, you may point out: The difference in the percentage value 
between two corresponding cells is so small that one cannot say 
that this is a real difference. Maybe the difference is the result of 
chance. Can we be sure that men are really more (or less) 
favourable than women in their views on whether or not women 
should work? To answer such questions, you can undertake statistical 
analysis using appropriate tests like the ‘Chi-square’ test. 


Usually, the decision to undertake statistical testing of your results 
will involve your having to take another kind of decision: How 
will you execute the appropriate statistical test or tests? The 
answer can range from ‘manual’ computation (pencil and paper 
calculations) to computer programmed testing. Now, you will 
notice that as you move from the most simple manual (in fact 
most cumbersome) procedure to the most complex computerised 
procedure the need to know how to do the computation reduces. 
You don’t have to know the formulae, nor do you have to know 
the step by step procedures. 
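For those who would still like to see what the computer does on their behalf, here is a minimal sketch of the Chi-square computation for a table of counts. The counts are hypothetical, and the function is written out in plain Python rather than taken from a statistical package. It returns only the statistic and the degrees of freedom, which then have to be compared with a Chi-square table.

```python
def chi_square(observed):
    """Pearson Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in the cell if row and column were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts. Rows: men, women.
# Columns: favour, depends, do not favour, don't know.
observed = [[180, 60, 90, 30],
            [144, 36, 48, 12]]
statistic, dof = chi_square(observed)
print(round(statistic, 2), dof)
```

For these invented counts the statistic is about 6.77 with 3 degrees of freedom. The tabled value for 3 degrees of freedom at the 5 per cent level is 7.815, so a difference of this size could still be the result of chance.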


Now, take the second example of age. If, as was hinted at the 
beginning, you want to compare the views of those belonging to 
the different age groups, you have to first make sure that your 
data will be in a form that would facilitate the comparison of the 
sub-categories of age. Thus, what are the age groups that you 
have in mind? You might say that you want to compare the young, 
the middle and the old age groups. Now, what do you mean by 
young, i.e., what age group? Say up to 34? Similarly about the middle 
(say 35 to 54?), and the old (say 55+?). If you had already decided 
that this is how you are going to classify your respondents in respect 
of their age, you could save a lot of energy in your study 
by wording your question as follows: To what age group do you 
belong: 18 to 34/35 to 54/55 or older? So you have just to obtain 
this information and, as in the example of sex, count how many 
are in each age group, and then count how many in each age 
group are in favour, not in favour and so on. 


But, and this is a very important ‘BUT’, if, at a later stage, after 
seeing your results, you wish to reclassify the age groups so that 
the old will now include only those who are 64 years of age or 
older, and that the young should consist of those who are not 
older than 24 years of age, then you have no way of making this 
correction in your data. But if only you had anticipated these 
possibilities, you might have decided that it would be most helpful 
for later analysis to initially collect detailed age data i.e., single 
year age data. 
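The advantage of single year age data is easily seen in a small sketch. The function and its arguments are hypothetical names. The cut-off ages are the ones discussed above.

```python
def age_group(age, upper_bounds, labels):
    """Classify a single-year age. `upper_bounds` holds the highest age of every
    group except the last; `labels` holds one name per group."""
    for bound, label in zip(upper_bounds, labels):
        if age <= bound:
            return label
    return labels[-1]


labels = ("young", "middle", "old")
ages = [19, 30, 45, 60, 70]              # hypothetical respondents

first_plan = [age_group(a, (34, 54), labels) for a in ages]   # up to 34 / 35-54 / 55+
second_plan = [age_group(a, (24, 63), labels) for a in ages]  # up to 24 / 25-63 / 64+
print(first_plan)
print(second_plan)
```

The same recorded ages serve both classifications. Had only the three broad groups been recorded, the second classification could not have been produced at all.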


So you see reviewing the kind of information you will collect and 
how you are going to summarise it and analyse it, is a vital 
planning task. If you do it, and there is no reason why you should 
not do it with the pre-tested schedule data, you will save a lot of 
heartaches later. 


In a similar manner you could plan for tabulations using the 
education, occupation and income data. Alternately, you can 
decide to use a combination of all these three variables by 
creating a ‘COMPOSITE VARIABLE’ named Socio-Economic 
Status index (SES). This is usually done because it is a common 
observation that these three are closely associated with one 
another. The procedure for computing this composite variable 
is given below. 


Socio-Economic Status: This is computed on the basis of the 
respondents’ answer to questions relating to education, occupation 
and income. The following scoring system has been adopted to 
determine the SES. 


Level     Education       Occupation    Income (Rs.)   Points (each) 

Low       Up to SSC       Blue collar   Up to 500      1 
Medium    Inter/Degree    Clerical      501-1000       2 
Higher    Post Graduate   Executive     1001 & more    3 


Since each respondent will be scored on all three variables, the 
minimum and maximum points that could be scored are three to 
nine. 


Thus the SES index would be: 


Low SES — 3 or 4 points 
Middle SES — 5 to 7 points 
High SES — 8 or 9 points 
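The scoring can be sketched as below. The sketch takes the points for the three levels as 1, 2 and 3, which is what gives the three-to-nine range stated above. The spellings of the category names are merely those chosen for this illustration.

```python
POINTS = {
    "education":  {"up to SSC": 1, "Inter/Degree": 2, "Post Graduate": 3},
    "occupation": {"blue collar": 1, "clerical": 2, "executive": 3},
}


def income_points(income_rs):
    """Income in rupees: up to 500, 501 to 1000, 1001 and more."""
    if income_rs <= 500:
        return 1
    if income_rs <= 1000:
        return 2
    return 3


def ses_index(education, occupation, income_rs):
    """Add the three scores (3 to 9) and classify as Low, Middle or High SES."""
    score = (POINTS["education"][education]
             + POINTS["occupation"][occupation]
             + income_points(income_rs))
    if score <= 4:
        level = "Low"
    elif score <= 7:
        level = "Middle"
    else:
        level = "High"
    return score, level


print(ses_index("up to SSC", "blue collar", 400))
print(ses_index("Inter/Degree", "clerical", 800))
print(ses_index("Post Graduate", "executive", 1500))
```

A respondent need not fall in the same level on all three variables. A graduate in a blue collar job earning Rs. 1200, for instance, scores 2 + 1 + 3 = 6 and is placed in the Middle SES.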


SECOND EXAMPLE: 
HOUSING SITUATION 


Very briefly, a careful review of the objectives will reveal that 
the primary analysis must be along two paths: First compare the 
present and the preferred in terms of each of the items that have 
been listed for data collection. 


Operationally, we have to find out for each sub-group of respon- 
dents (in terms of their current housing) what their preferred type 
of dwelling is. So you need to undertake cross tabulation of present 
and preferred housing. Your results must enable you to fill in the 
following dummy table. 


PRESENT AND PREFERRED DWELLING 


Present               Preferred dwelling                Total 
dwelling      Hut    Chawl    Flat    Bungalow          100% 

Hut 
Chawl 
Flat 
Bungalow 

Total 


Secondly, and using the same procedure as above, prepare tables 
wherein you provide for the comparing of responses regarding 
present and preferred dwellings among respondents with different 
characteristics. 


SOCIO-ECONOMIC STATUS AND PRESENT DWELLING 


Socio-economic          Present dwelling               Total 
status          Hut    Chawl    Flat    Bungalow       (100%) 

Low        N 
           % 
Moderate   N 
           % 
High       N 
           % 


Similarly, at the planning stage, prepare a number of dummy 
tables to make sure that you will be collecting the data you need 
and will be utilising the data in a proper manner. 


Finally, to find out if there are real differences between sub- 
categories of the independent variables that are being compared, 
you may decide to use an appropriate statistical test. It will be 
useful to decide at this stage what tests you will be resorting to 
for your analysis. 


THIRD EXAMPLE: 
MEASURING CONSCIOUSNESS 


You will recall the concept of consciousness that was presented 
in Chapter One, Example Three. The procedures to be followed 
after data collection to arrive at the analysis stage are quite complex. 
In fact, I would advise that, if you can, you should read the 
original report carefully (See Preface for the reference). 


First, the response of each respondent to each and every question 
has to be scored according to the ‘position’ of the response in the 
‘magical to critical consciousness’ continuum. This varies from 
question to question and related responses. 


Secondly, the scores obtained by each respondent on each question 
have to be cumulatively added up to obtain the component score, 
the dimension score, and finally the overall consciousness score. 


Thirdly, the raw component scores, dimension scores and overall 
consciousness scores have to be categorised into three major 
categories, viz the magical level of consciousness, the naive, and 
the critical level of consciousness. 


Having obtained the above scores (raw values) and the level of 
achievement (categorised), the analysis design can be construed 
to be a simple one of merely presenting the results in terms of 
the distribution of community members in the different categories. 
At the same time, the average score obtained by the members for 
each component, the dimension as well as the overall can also be 
computed. 


However, reading the objectives more closely, you may want to 
compare the community members who are also participants in the 
adult education programme with those who are not members. Or, 
you may prefer to have a more detailed analysis of sub-groups of 
respondents. For example, is there a real difference between men 
and women in their level of consciousness, are young adults 
different from the older ones in respect of their consciousness; 
and so on. But as you develop this analysis plan, cross check with 
the first stage of your project to make sure you have provided for 
the collection of data on the respondents’ characteristics. If you 
have not already included these in your list of variables/items of 
information, this would be the ‘last chance’ to revise your list of 
items of information and accordingly modify your interview 
schedule. If you miss this opportunity, well, it is just too late to 
do anything about it after the data have been collected. So the 
point here is do your analysis designing before finalising the plan 
for the study. 


Assuming you have the requisite data, characteristics included, 
you can now have a more detailed analysis design. You can 
compare (using appropriate statistical procedures) not merely the 
relative performance on different dimensions, but also, within each 
dimension, the relative contribution of each component. At each 
of these points, you can introduce the characteristics to find out 
whether there are variations among sub-groups of respondents. 


FOURTH EXAMPLE: 
DCPs AND HUMAN DEVELOPMENT 


This example too can be viewed in two ways. First is the simple 
procedure of preparing, as with Examples One and Two, cross 
tabulations between each of the independent variables and each 
of the statements. Appropriate statistical tests can then be used to 
find out if there are real differences among sub-groups of 
respondents. 


The more complex procedure would be to follow the same principal 
steps as for Example Three. First, sort out the different 
statements into pre-determined categories according to the operational 
meaning of ‘Human Development’ (HD). For example, 
you could have the following categories: understanding of 
HD, approaches to HD, and commitment to HD. 


Having done the categorisation, you have to get the scores for 
each respondent on each of the categories, and you can convert 
the raw scores into appropriate groups like ‘low’, ‘medium’, ‘high’ 
levels of understanding of human development. 

The third step then would be to do the cross tables as stated earlier 
between each of the independent variables on the one hand and 
each of the components of human development on the other. 


CHAPTER 6 


FIELD 
DATA 
COLLECTION 


The fourth and final design in the strategy planning exercises is 
to decide what should be the field procedure to be adopted for 
the actual collection of data and for this purpose also anticipate 
the problems that you are likely to face. In other words, we are 
now to consider how to organise the field work. The respondents, 
as we have already seen, are predetermined in sample surveys on 
the basis of the sampling design, and therefore, the investigators 
have to trace the respondents and interview them. 


This stage of a research study consists of the following two steps: 
Step One: Recruiting and Training the Field Investigators 


Step Two: Field Data Collection, i.e., canvassing interview schedules 
with the respondents who constitute the sample for the study. 


Each of these will be taken up for detailed consideration. 


RECRUITING AND TRAINING 
FIELD INVESTIGATORS 


The first question that we have to answer is: How many are to 
be recruited? The second question will be: How are they to be 
trained? Let us take the first question. 


RECRUITING 


The number to be recruited will depend on the total number of 
interviews to be conducted, the location of the interviews, and 
how many respondents can be interviewed per day. Answers to 
these questions are available to you from two sources: the research 
plan, and especially the scope of the study (see Chapter 2), and 
the experience gained from the pre-test exercise. We shall assume 
here that information from both these sources reveals that the 
study is being done in a rural area, and depending on the season, 
on an average, one investigator can conduct about 7 to 8 interviews 
per day, with each interview being no longer than 35 to 45 minutes. 
Thus, the investigator will have about eight hours of work per day. 


A rule-of-thumb is now available to determine the number of 
investigators. Divide the total number of interviews to be 
conducted by the base of 7 to 8 to arrive at the number of days in 
which one person can complete the work of interviewing the total 
sample for the research study. But the longer it takes (in terms 
of days) for the total sample, the more tedious and boring it 
becomes for the single investigator and he may tend to ‘bias’ the 
responses in a number of ways. At the same time, the longer it 
takes, the less comparable the conditions of the initial and the later 
interviews may become. Hence, the optimum time has to be found. 
For rural areas such field exercises need not exceed one 
month. This is just a notional figure and changes from research 
study to study. Nor can the number of investigators be 
increased beyond an optimum number, as variations among too many 
investigators may contaminate the results. For the present, it is 
sufficient to say that, assuming one month (30 working days) would 
be the optimum period, divide the earlier number of days for one 
investigator by 30 to arrive at the total number of investigators 
that would be required. 


Let us take an example. The total number of interviews to be 
covered is 600. This divided by 7 results in 85.7 days (rounded 
to, say, 90 days). This divided by 30 gives an average of three 
investigators. If the number of working days is just 20, then the 
number of investigators required for data collection would be: 
600/7/20 = 4.3, i.e., about 4. 
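
The rule-of-thumb can be sketched as a short calculation. This is only an illustrative sketch in Python: the function name investigators_needed and its parameters are assumptions made for this example and do not come from the text. The result is rounded to the nearest whole number, as in the worked example; passing round_up=True gives the more cautious figure.

```python
import math


def investigators_needed(total_interviews, per_day, working_days, round_up=False):
    """Estimate the number of field investigators (hypothetical helper)."""
    # Days one investigator would need to cover the whole sample alone.
    person_days = total_interviews / per_day
    # Spread those person-days over the chosen field period.
    exact = person_days / working_days
    # The worked example rounds to the nearest whole number;
    # rounding up guarantees the work fits within the period.
    return math.ceil(exact) if round_up else round(exact)


print(investigators_needed(600, 7, 30))                 # 30 working days
print(investigators_needed(600, 7, 20))                 # 20 working days
print(investigators_needed(600, 7, 20, round_up=True))  # cautious estimate
```

With 600 interviews at 7 per day, this gives three investigators for a 30-day period and four for a 20-day period (five if you round up).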


TRAINING THE INVESTIGATORS 


Having selected the required number of investigators, it is neces- 
sary to train them in interviewing in general and in the canvassing 
of the specific research study’s instruments of data collection in 
particular. Hence, even if they have been doing similar tasks 
earlier, they would still need training in respect of each specific 
instrument of data collection. Briefly, the training would consist 
of the following: 


1. Details of organisation for doing the study 

2. The objectives of the study and the research strategy 

3. The interview schedule to be explained question by question 

4. Mock interviews to be conducted among the investigators to 
ensure that they understand the questions and what would 
constitute relevant answers. Retraining as required 

5. Field trials of interviews to be conducted. Correctives to be made 

6. Explaining the sample design and procedure for selecting 
respondents 

7. Providing training on actual interview procedures beginning 
with location of respondents and including the establishing of 
rapport, opening the interview, conduct of interview, golden 
rules for canvassing interview schedules, and closing the 
interview 

8. Anticipated field difficulties and corrective steps to be taken 

9. Checking the completed interview schedules 

10. Informing the investigator about the mode and rates of pay- 
ment for interviewing work. 


DATA COLLECTION 


The data collection step consists of the following sub-steps: 
(A) locating the address of respondents 

(B) meeting the respondents 

(C) establishing rapport 

(D) interview proper 

(E) closing the interview. 


We shall now discuss each in detail. 


LOCATING THE ADDRESS 


The first problem that the investigator faces is that of locating the 
address of the respondent. It is possible that some time has been 
spent locating the area, road and building in which the respondent 
lives. Where the numbering of houses is in order, the location of 
the respondent’s home is not a difficult task. But if the numbering 
is not in sequence, then much time is spent in locating the 
residence. The field worker should be particularly careful not to 
get discouraged and substitute another respondent 
who is easily available. This warning has to be seriously heeded 
in sample surveys because the unwarranted substitution of cases 
introduces bias. 


How does one go about finding the right address? There are two 
procedures. In the first place, it is advisable for the investigator 
to spend a day or two going round the assigned localities and 
familiarising oneself with the areas. Landmarks should be noted 
for future reference. If possible, each worker or a pair of workers 
should be allocated to one zone so that they can be thoroughly 
familiar with that particular area. There is every likelihood that 
even this may in some cases prove inadequate. In such cases, the 
second course of action is to contact the local postman, the 
municipal rent collector, the electrician who reads domestic 
meters, etc. 


MEETING THE RESPONDENTS 


Once the residence has been located, the next step, a comparatively 
difficult task, is to contact the respondent concerned. At this 
point, one of two situations is 
possible. The respondent may be either away or at home. If she 
is away, then it is necessary to ascertain when she will return and 
arrange for an appointment. In case the house is locked, then the 
same information about the respondent may be sought from the 
neighbours. If even this is not possible, it would be necessary to 
return later. 


If the respondent is at home, she may either be too busy to spare 
the time for an interview or alternatively free to talk to the 
investigator. In the former case, it would be necessary to make 
an appointment for the convenience of the respondent and later 
return for the interview at the appointed time. If she is free to 
talk, then you proceed with the interview. 


ESTABLISHING RAPPORT 


The first thing that the interviewer does on meeting the respondent 
is to introduce herself and explain the purpose of her visit and 
the aims of the study and seek her co-operation. It is, therefore, 
necessary that the interviewer be able to convince the respondent 
of the importance of the survey and that it is worthwhile for the 
respondent to share some time and energy in answering the 
questions. Thus, the respondent herself must feel that she is 
making some contribution, she must feel acquainted with the 
interviewer and be friendly towards her. If the respondent readily 
agrees to be interviewed, then the investigator will proceed with 
the interview. If, however, she is antagonistic to the interviewer, 
it is the duty of the investigator to do what is needed to 
convince her. 


During this time, the investigator also has the additional task of 
establishing rapport with the respondent. It is imperative that the 
investigator inform the respondent of the type of information 
being asked for and, at the same time, assure her that the given 
details will be kept strictly confidential. She will also indicate the 
means adopted to ensure such anonymity. It is a frequent 
experience that the respondent may ask many questions, including 
the type of answers given by other respondents, the utility of such 
studies to her, etc. The investigator should give frank answers, and 
yet not disclose information which is not called for or is 
confidential. In general, this implies that she must, in this short time, 
establish a relationship of friendliness, confidence and trust that 
provides the foundation for good interviewing. She should clear 
any suspicions the respondent may have and ensure her 
co-operation. The establishment of such a rapport will depend, among 
other things, on three factors: 


(a) the interview situation: the physical environment, the location 
of the place of interview etc. 

(b) the interviewee: her temperament and moods, status and 
outlook. 

(c) the subject of interview. 


It hardly needs mentioning that it also depends on the interviewer 
herself. She should not assume the role of the moralist, the 
‘know-all’, the quiz master or a cross-examiner. 


It is normally expected that after this preliminary introduction, 
the respondent would be most willing to co-operate in answering 
the questions. Unfortunately, this is not always so. Problems in 
this area are mainly of two types: 


The first is the problem of the individual respondent; the second 
is the problem of faction groups. Sometimes, a respondent does 
not extend her co-operation to the interviewer. She may either 
refuse totally or agree to answer only some questions. The reasons 
may vary from respondent to respondent. But in general, it may 
be due to fear of how the information will be used or to whom it 
will ultimately be given. Some may be disinterested in studies 
as such and may consider it a waste of time to talk to the 
investigator. It is necessary in such cases to try and find out why 
the person is refusing to talk. On the basis of this information, it 
would be possible to convince her of the importance of her 
participation. Where the interviewer herself fails to elicit the 
necessary co-operation, she may be able to get a person known to 
the respondent to talk to her and seek her co-operation. 


This second procedure of getting another person to talk to the 
respondent may be difficult as one has to go in search of such a 
person. Thus, it will be left to the interviewer herself to use her 
own wits to tackle the respondent. Quite often, after repeated visits 
to the respondent, it may finally be possible to get her to talk. 
But even if after these many visits, she still refuses to participate, 
then the investigator reports to the supervisor who will then meet 
the respondent and try to establish rapport. We may now discuss 
some of the reasons for initial non-co-operation and how they 
may be tackled. 


Disinterested in Study 


During a study undertaken in a medium-sized town, a respondent 
refused to give information though she was approached a number 
of times. But every day, the investigator used to visit her and talk 
about different topics. The aim here was first to become a good 
acquaintance. In the end, the investigator could get the informa- 
tion, though not in a single day, but after another two to three 
meetings. 


Identification with Revenue Officials 


A prominent medical practitioner fixed an appointment for the 
interview. He was rough during the first contact, but finally 
observed that the investigator was really doing very good social 
work. The interview then started smoothly with all the details. He 
responded to the item on income which ran into lakhs of rupees. 
Unfortunately for the interview and the investigator, the respondent’s wife 
came in at this time. She was a partner of the respondent in the 
dispensary maintained by them. She called him inside and told 
him about some income tax difficulties. (The investigator could 
not help overhearing the conversation.) The respondent then came 
out and told the investigator, frankly and bluntly, that he could 
not proceed with the interview. He would not heed any arguments. 
It was impossible to tackle him at this stage or at a later stage. 
The rapport could not be re-established. 


Fear of Wrong Identification 


In the red district of a town, a lone middle class family of a grocer 
was selected for interview. On the first day, he refused to give 
information despite the investigator explaining to him all the concepts 
as well as the objectives of the survey. His two main prejudices, that 
the investigator had visited a number of low class and ‘addict’ 
families and that he was not an addict, were overcome. Then, he 
got a new idea that the investigator was likely to create income 
tax trouble. Twice more, the investigator met him on this point. 
Nor could the investigator select a similar family in the locality, because 
he had been instructed not to substitute one family for another. By this 
time, the informant had sought his lawyer’s advice, and the latter asked 
him to refuse to give information. On a later occasion, when the 
investigator met him, he wanted to throw the investigator out of the 
house. 


The investigator was simply adamant and told the person that he 
never accepted a refusal. After an hour’s heated argument and 
continuous persuasion, the investigator got the requisite informa- 
tion. It hardly needs to be stated that this is not a good procedure 
for data collection. 


Anti-Charity 


At one place, the investigator was welcomed with mango skins. 
The respondent’s wife did not open the door. The investigator 
wanted to make an appointment, but she refused to answer any 
questions. The investigator then started talking to her neighbour 
and, within five minutes, the respondent’s wife came out of her house asking the 
investigator what she wanted and the purpose of her visit. She 
said that she thought that the investigator had come for some fund 
collection. The latter told her about the study. She then apologised 
and willingly gave the required information. 


The second type of problem that is faced in the stage of data 
collection is that of group factions. This is of a more serious 
nature as it tends to affect the study very badly. Such factions, 
which exist in different localities, oppose each other on any issue 
that involves the community at large. In such a situation, if the 
researcher is accepted by one of the factional groups, there is 
every likelihood that she may be shunned by the other, and this 
antagonises the community as a whole. In such an event, it would 
be advantageous to the researcher to understand the conflicts 
thoroughly, not necessarily with the intention of solving them or 
getting involved in them, or even tackling them, but to steer clear of them, 
if that is possible. In such cases, it is necessary to seek local 
leaders respected by both groups to reconcile them with regard to 
the survey. 


Rapport Techniques 


Some of the more common techniques for establishing rapport, as 
well as maintaining good relations throughout the interview, may 
now be discussed. It needs mentioning, however, that not all of them 
need to be applied always. The technique to be utilised depends on 
the type of respondent. 


Free Airing of Views 


This involves patient listening on the part of the interviewer and 
permitting the respondent to give vent to her feelings, opinions, 
etc. After the respondent has more or less ‘exhausted’ herself, the 
interviewer may be in a position to assess her suspicions and gain 
her good will. On one occasion, a respondent, when approached, 
went into a tirade against all and sundry, talking of her difficulties, 
the futility of surveys, etc. Then the interviewer told her that she 
was glad that the respondent had positive opinions on many 
subjects, and that it might therefore not be too much for her to answer some 
of the interviewer’s questions as well. Therefore, the first and foremost thing 
to be remembered by an investigator is that the respondent should 
not be antagonised at any stage of the interview, and it becomes 
necessary to give a patient hearing to what she says. No doubt, 
time is an important factor and this technique cannot be used 
indefinitely. 


When we open the subject of interview, the respondent might 
have something interesting to say on it but, in doing so, she might 
go astray. If necessary, we may even sympathise with the 
respondent, thus ensuring that in due course we get the necessary 
information, as the respondent feels bound to furnish it because 
we have patiently listened to her. During the talk she can 
also be diverted from what she said earlier to the interview proper, 
and such instances are not few. The following will illustrate this 
point. 


During the course of her interview of a dockyard worker, the 
investigator asked him his wages and, naturally, he began saying 
that these were inadequate as prices were soaring. Among 
other things, he complained about bad working and housing 
conditions. Immediately, the investigator asked him whether good 
working or housing conditions would move him to another job. 
This question was, of course, put along with other alternative 
incentives to change the job and the respondent gave the answer. 
If the investigator had then told the respondent that his talk 
regarding high cost was not of any use, he would have felt hurt 
and probably refused to participate further in the interview. 


I stated earlier that this technique has its limitations, and if the 
respondent goes totally off the point, it should be tactfully 
suggested to her to answer specific questions. 


Identification 


Here, the aim is to establish some common ground with the 
respondent which would enable the establishment of rapport. 
Essentially, its purpose is to assure the respondent that the 
investigator is no stranger. This has been found particularly useful in 
Indian conditions. For example, in areas where people come to 
live from different parts of the country, the fact that the 
investigator is also from their place of origin usually tends to establish 
a bond. Again, in rural areas, when the investigator speaks the 
same dialect or knows the local customs and manners and behaves 
accordingly, good relations are quickly established. Items of 
identification may range from common place of origin and language 
to common sufferings in similar situations and experiences. 


Complimentary Remarks 


Flattery is a weapon which can be gainfully utilised, or misused 
with deterrent results. It should be used with sincerity and with 
a genuine object, like comments on the children, calling by the 
popular designation in the informal setting, etc. 


A few other techniques are given in the earlier examples on initial 
non-response. 


THE INTERVIEW PROPER 


Once rapport has been established, the next step is to proceed 
with the interview proper. The first stage here is to get the 
respondent to relax, to assess the level at which the interview 
should be conducted, the general ability of the respondent to grasp 
the meaning of questions, etc. As regards the general behaviour 
of the investigator throughout the interview, the following should 
be carefully avoided. 


(a) Giving an impression of superiority over the respondent 

(b) Intimidating the respondent into giving answers, 
particularly those acceptable to the investigator’s viewpoint 

(c) Asking leading questions 

(d) Answering questions on behalf of the respondent, especially 
when the latter is either groping for an answer or trying to 
avoid answering the questions put by the investigator. 


The investigator should, however, adopt the following procedures 
during the interviews: 


(a) Probe into the answers of the respondent when they are vague, 
contradictory or inadequate 

(b) Encourage the respondent to speak, by way of proper remarks, 
when he seems lost or unable to verbalise his opinion and 
feelings 

(c) Appear sympathetic to the respondent while not identifying 
oneself with his views 

(d) Take down verbatim the answer of the respondent, except 
when alternatives are already included in the schedule. In the 
latter case, care must be taken to tick off the proper answer 

(e) Ask the questions in the order in which they are 
given in the schedule. However, if the need arises because 
the respondent is unable to answer a question immediately, or 
if it is felt that some later question may be asked before a 
particular question is repeated, a change in sequence may be 
permissible. 


A problem that you may face during interviewing is that of 
language. Quite often, though you may know the language of a 
given area, you may not be familiar with the local dialect. 
Variations in the way of speaking a language from region to 
region are something that cannot be easily dealt with. Hopefully, 
you are aware of this before you start your data collection, and 
take steps to get an investigator who knows the language and the 
dialect. 


A second problem pertains to the actual responses. Often the 
question on age does not elicit the correct answer, either because 
the respondent likes to give a different age or because she does 
not know the correct age. In the former case nothing much can 
be done, but in the latter case she may be helped to find the 
correct one by referring to some landmarks in the calendar of the 
place, like main religious events, important happenings, etc. 


CLOSING THE INTERVIEW 


Towards the end of the interview, it is helpful to let the respondent 
anticipate the closure of the interview by commenting, ‘Now, just 
this last question’, or something to that effect. The stage is then 
set for the interviewer’s departure. The interviewer can also signal 
her departure by folding her schedule, standing up or moving 
towards the door. The respondent should be thanked for her help 
and co-operation in the survey and be left with a feeling of 
satisfaction for having co-operated in a worthwhile undertaking. 
In cases where it is particularly difficult to leave, the suggestion 
that the interviewer has quite a few calls to make and while she 
would like to discuss the matter further she just cannot, will 
usually lead to a quick and diplomatic departure. 


NON-RESPONSE 


Usually non-response means that the investigator has tried to 
contact the respondent who is in the sample, but has not been 
able to interview her. Hence, non-response may arise because of: 


(a) Inability to contact the respondent at all because she is away 
from town, in town but not available at home, deceased, etc. 
(b) Refusal of the respondent, once contacted, to be interviewed. 


In either case, it may be seen that the sample quota tends to fall 
short of the total by the number of non-responses obtained. 
Usually, the question is asked: Should all non-respondents be 
substituted by other respondents? The answer to this question 
depends among other things, on the type of study and the sample 
design. 


CHAPTER 7 


DATA PROCESSING 


With the collection of data, the field work stage of the study 
concludes. The next step is to scrutinise the collected data. This is 
preferably done in the field itself so that, if necessary, you can 
go back to the field to fill in gaps in information, to verify doubtful 
information, and even get additional information which may throw 
more light on the problem. We now discuss each of the following 
steps in detail. 


A. Scrutiny and editing of forms 
B. Classification of forms 
C. Coding 


SCRUTINY AND EDITING 


This step forms a very important part of the processing of data 
and if done well, saves time, energy and confusion at a later stage 
of work. 


What should one look for while scrutinising the data collected? 
These are classified into four headings: 


(1) Completeness of forms 
(2) Relevance of responses to the questions 
(3) Internal Consistency 
(4) External Consistency 


COMPLETENESS 


By this is implied that each question in the schedule must have 
a response recorded against it. No question should be left blank. 
If the respondent has not answered a particular question, you must 
record that fact and the reason for the non-response against the 
question. 


RELEVANCE 


Where responses have been made to specific questions you should, 
in checking the response, make sure that the answers recorded 
are relevant to the question. For example, if the response recorded 
against Marital Status is ‘Good’, then it is a wrong response. What 
is required is the actual status, viz., married, unmarried, divorced, 
widowed, etc. 
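
The completeness and relevance checks can be sketched as a small routine. This is only an illustration in Python under stated assumptions: each schedule is taken to be a dictionary of question labels and recorded responses, and the names scrutinise and ALLOWED, as well as the list of acceptable marital statuses, are hypothetical choices made for this example.

```python
# Hypothetical list of acceptable responses per question (an assumption;
# the text lists these statuses and adds "etc.").
ALLOWED = {"marital_status": {"married", "unmarried", "divorced", "widowed"}}


def scrutinise(schedule, questions, allowed=ALLOWED):
    """Return (question, problem) pairs found in one schedule."""
    problems = []
    for q in questions:
        answer = schedule.get(q)
        # Completeness: every question must have something recorded,
        # even if it is only a note that the respondent did not answer.
        if answer is None or str(answer).strip() == "":
            problems.append((q, "blank"))
        # Relevance: the recorded answer must be one the question permits.
        elif q in allowed and str(answer).strip().lower() not in allowed[q]:
            problems.append((q, "irrelevant response"))
    return problems


form = {"age": 35, "marital_status": "Good"}
print(scrutinise(form, ["age", "marital_status", "religion"]))
```

In this sketch the form is flagged twice: ‘Good’ is not a marital status, and nothing has been recorded against religion.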


INTERNAL CONSISTENCY 


By internal consistency is implied that there should be agreement 
in the responses recorded against similar questions placed in 
different parts of the schedule. For example, if in the early part 
of the schedule a question on age was asked, and later, in the third 
quarter, questions relating to the employment period were also 
asked, a comparison of the two answers should attest to the 
reliability of each answer. Suppose that the 
respondent says she is 35 years of age. But she also says that she has 
been employed in a number of different jobs for different numbers of years; 
when you add up the years worked on different jobs, you may 
find that she has been working for a total of 40 years. Obviously, 
she could not have been working before she was born. So you 
have to probe the question of age as well as employment history 
to arrive at the most nearly correct answer. 
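
The age and employment comparison can be expressed as a simple check. The sketch below is illustrative only: the function name is_consistent and the way the jobs are listed are assumptions for this example. A stricter version might also deduct a minimum working age, which is not attempted here.

```python
def is_consistent(age, years_per_job):
    """Check that total years worked do not exceed the stated age."""
    # Add up the years reported against each job held.
    total_worked = sum(years_per_job)
    # A respondent cannot have worked for longer than she has lived.
    return total_worked <= age


# A respondent aged 35 who reports 15 + 15 + 10 = 40 years of work.
print(is_consistent(35, [15, 15, 10]))
```

A schedule that fails this check is not corrected automatically; it is set aside so that both the age and the employment history can be probed again.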


EXTERNAL CONSISTENCY 


The check here is whether the responses are in keeping with our 
knowledge of a given situation as obtained from other sources. 
No doubt, there are occasions when there may not be any con- 
sistency. In such cases, it is necessary to establish the reasons for 
the difference and it is essential that it be objectively ascertained 
whether it is a possible variation. 


For example, a respondent may say that his first job was with the 
army which he joined as soon as the Second World War started; 
which he claims was in 1940. In fact the Second World War 
started in 1939 in Europe and in 1941 in Asia. The question is: 
which is the War theatre he is referring to? To answer this you 
have to find out which army he joined; where did he join it; in 
which ‘theatre’ did he participate and so on. 


Instructions for the scrutiny and editing of forms must be prepared. 
The following examples will illustrate the instructions. 


Code Serial Number: The Code Serial Number of each form 
consists of five digits, the first relating to the response to the 
question ‘Should women work?’, the second to the SES, the third 
to education level, and the last two to the serial number within 
the sub-category. 


Q.2 Sex: Make sure that M or F is clearly written. 


Q.3 Religion: This is recorded as Hindu, Buddhist, Christian, 
Jain, Jew or Muslim. If religion is recorded as anything other than 
‘Hindu’, the space provided for ‘caste’ will be N.A. 
(not applicable). If religion is recorded as ‘Hindu’, then the caste 
will be recorded. 


Q.7 (a): The amount stated should be equal to or more than the 
amount shown under the last item in Q.7 (b), ‘Total Rs.’. If such 
answers as ‘Rs. X’ + ‘20 maunds rice’, ‘one kandy rice’, etc., are 
recorded, convert the earnings in kind to cash values at the 
following rate: 1 maund = Rs. 12, 1 kandy = 20 maunds. After 
the conversion to cash value, divide the amount by 12 and add 
the answer to the previously recorded figure. 


Example: Q.7 (a): Rs. 145 + 20 maunds of rice (20 × 12 = Rs. 240; 240/12 = 
Rs. 20 p.m.), i.e., Rs. 145 + 20 = Rs. 165 p.m. 
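
The conversion in this instruction can be sketched as follows. The rates are those stated in the instruction; the function and constant names are hypothetical, and reading the division by 12 as a conversion from a yearly amount to a monthly one is an assumption based on the ‘p.m.’ in the example.

```python
RS_PER_MAUND = 12       # rate stated in the editing instruction
MAUNDS_PER_KANDY = 20   # 1 kandy = 20 maunds
MONTHS_PER_YEAR = 12    # assumption: earnings in kind are yearly


def monthly_earnings(cash_rs, maunds=0, kandies=0):
    """Return monthly earnings in rupees, including earnings in kind."""
    # Express all earnings in kind in maunds first.
    total_maunds = maunds + kandies * MAUNDS_PER_KANDY
    # Convert to cash, then to a monthly figure.
    in_kind_monthly = total_maunds * RS_PER_MAUND / MONTHS_PER_YEAR
    return cash_rs + in_kind_monthly


print(monthly_earnings(145, maunds=20))
```

For the example in the instruction (Rs. 145 plus 20 maunds of rice) the sketch gives Rs. 165 per month, and one kandy gives the same result as 20 maunds.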


Q.7 (b): Check the total income of the respondent. 


Q.7 (c): Here only information about the respondent’s other 
earnings should be recorded. In case other information has been 
recorded, it should be cancelled. 


CLASSIFICATION OF FORMS 


Once the data has been checked and the relevant additions, 
deletions and corrections made, the next optional step is to classify 
or categorise the forms into some homogeneous groups appropriate 
to the objectives of the study. It is optional because preliminary 
classification of the schedules has to be done only if you intend 
to manually process your data. In computer-based data processing, 
you need to consider classification of respondents only after all 
the data have been entered into the computer ‘data file’. 


There is no fundamental difference between manual and computer 
data processing. In manual data processing we try to reduce the 
physical work as much as possible by anticipating the data analysis 
that will have to be done. In fact, in the light of the analysis 
design that was already prepared, it is possible to take some ‘short 
cuts’ without introducing any errors. 


A question you may wish to ask here is this: If computer data 
processing is faster and more efficient, why do you have to learn 
manual procedures? The fact is, that you have to instruct the 
computer to do the necessary tasks for you, i.e., the same steps 
as you would take in manual processing. It is for this reason that 
you should have the necessary knowledge and skills in data 
processing. This is best learnt through manual procedures. 


We now start with a detailed description of the manual procedure. 


Classification means that the interview schedules of respondents 
are initially physically sorted by some characteristics which are 
to be used most frequently for analysis later. For example, if the 
objective is to study the family budget, it is advisable to initially 
classify respondents into a few income groups. In the example, it 
is anticipated that the pattern of expenditure, savings and assets 
of the different income groups would differ from one another. 


If you find it useful, you may further classify the respondents 
according to a second characteristic. In the example just given, 
you may wish to further sub-classify in terms of rural and urban 
residence of the respondents. So, within each income group you 
have two sub-groups. 


Here, an important question arises: How many levels of classifica- 
tion are possible? The primary condition is that there should be 
at least a few respondents in each group. There is no point 
sub-classifying if there is going to be only one, two or, at the 
most, five respondents in a group. It also depends on the kind of 
statistical analysis that is to be made. But the rule-of-thumb is to 
have about 10 per cent of the total in each group. 


Let us consider an example here. This is from the Women and 
Employment study that you are quite familiar with by now. Let 
us assume that the analysis design for this study anticipated that 
the initial analysis will be to compare the responses of different 
‘socio-economic status’ groups and subsequently, it will be around 
the education level (but within each socio-economic status), and 
the marital status of respondents. 


This being so, the question is: How do you reduce the manual 
work involved in the manual processing, keeping in view the need 
to undertake quite some analysis? You can plan your work along 
the following lines. First, physically sort all interview schedules 
into three piles. The first pile will consist of forms of all 
respondents who said that women should work. The second pile will 
consist of forms of all those who were undecided. The third pile 
will contain the remaining forms wherein the response was ‘should 
not work’. 


Since the primary tabulations will be around the socio-economic 
status of respondents, and assuming that you have already recorded 
in the interview schedule the respondents’ status, you take the 
second step in processing. Take the first pile of ‘SHOULD WORK’, 
and re-sort it into three piles. The first pile here will consist of 
the LOW SES respondents. The second pile will consist of those 
from the MIDDLE SES group. The third pile will consist of the 
‘shoulds’ who have HIGH SES levels. The same can be repeated 
for the other two major piles and this is what you will get: 


SORT STAGE:   1ST (DEP. VARIABLE)   2ND            PILE NUMBER

              1. SHOULD             1. Low SES     11
                                    2. Mid SES     12
                                    3. High SES    13
              2. DEPENDS            1. Low SES     21
                                    2. Mid SES     22
                                    3. High SES    23
              3. SHOULD NOT         1. Low SES     31
                                    2. Mid SES     32
                                    3. High SES    33
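
The two-stage sort can be mimicked in a few lines. This is a sketch only, assuming each form is held as a dictionary with ‘opinion’ and ‘ses’ entries; these labels and the function names are hypothetical, while the pile numbers follow the chart.

```python
from collections import defaultdict

# First digit: response to the main question; second digit: SES level.
OPINION = {"should": 1, "depends": 2, "should not": 3}
SES = {"low": 1, "mid": 2, "high": 3}


def pile_number(opinion, ses):
    """Return the two-digit pile number, e.g. 11 for SHOULD and Low SES."""
    return OPINION[opinion] * 10 + SES[ses]


def sort_into_piles(forms):
    """Group forms by pile number, as in the physical sorting."""
    piles = defaultdict(list)
    for form in forms:
        piles[pile_number(form["opinion"], form["ses"])].append(form)
    return dict(piles)


forms = [
    {"opinion": "should", "ses": "low"},
    {"opinion": "should", "ses": "low"},
    {"opinion": "should not", "ses": "high"},
]
piles = sort_into_piles(forms)
print({number: len(group) for number, group in piles.items()})
```

Here two forms fall into pile 11 and one into pile 33, which is what the manual sort would yield for the same three schedules.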


Now, you are ready for the third level sorting, which is in terms 
of education. Let us assume here that the classification of 
respondents will be in terms of those who have not been to college (the 
non-collegiates, NCs) and those who have been to college 
(the collegiates, Cs). The third level sorting will then yield the 
following ‘chart’. In order to complete this task let us make just 
one more assumption, viz., that there are between 20 and 25 
respondents in each sub-sub-category. This will enable us to take the 
final step in classification, i.e., numbering the schedules in a 
manner that retains each one’s identity. 


SORT STAGE:   1ST             2ND            3RD      SCH. Nos.

              1. SHOULD       1. Low SES     1. NC    111 01 TO 111 20
                                             2. C     112 01 TO 112 23
                              2. Mid SES     1. NC    121 01 TO 121 25
                                             2. C     122 01 TO 122 24
                              3. High SES    1. NC    131 01 TO 131 25
                                             2. C     132 01 TO 132 23

              and so on till the last pile

              3. SHOULD NOT   1. Low SES     1. NC    311 01 TO 311 20
                                             2. C     312 01 TO 312 20
                              2. Mid SES     1. NC    321 01 TO 321 23
                                             2. C     322 01 TO 322 20
                              3. High SES    1. NC    331 01 TO 331 22
                                             2. C     332 01 TO 332 20


You will notice from the above listing that without much difficul- 
ty, and by just adding up the relevant numbers, we can get the 
distributions for the following: 


1. The number of persons who said:
Should (schedules 111 01 to 132 23)
Depends
Should Not (schedules 311 01 to 332 20)
2. The number who belong to: 
Low SES, Middle SES and High SES 
3. The number who are: 
Non-collegiate, Collegiate 
4. The number who are: 
in LSES and said Should 
in LSES and said Depends 
in LSES and said Should not 
and so forth 
5. The number who are: 
in LSES and Non-collegiate 
in LSES and Collegiate 


CODING RESPONSES 


As the word implies, coding aims at assigning numbers (numerics) 
or letters (alphas) to responses to a particular question to facilitate 
tabulation, summarisation and analysis of the data. In principle, 
the answers are first classified into homogeneous categories and
then, each category is given a code number. When tabulation is 
carried out, the actual response is represented by its code and then 
tabulated. 


NEED FOR CODING 


The question now arises as to why coding is necessary and under 
what circumstances should coding be resorted to. First, coding is 
the intermediate step to tabulation. It permits easy handling of 
data. It also introduces uniformity in the categorisation of respon- 
ses and later, analysis and presentation of results. 


However, coding is not resorted to in every study. Where the 
study is on a very small scale, and the types of questions asked 
are very limited, then coding may not be essential.


WHEN TO CODE


We now come to the question of when coding should be intro- 
duced. There are three possibilities. 


(a) in the schedule or questionnaire, i.e., precoded questions 
(b) by the investigators soon after completing the day’s interviews 
(c) after all the data collection has been completed. 


We can discuss each one of these. 


(a) When the questions are close-ended, i.e., the responses are 
also listed against each question, these responses can be coded 
in the schedule itself. Then the appropriate response code can 
be ticked off as the respondent answers the questions. 

While this has the advantage that it does away with coding
later and thereby saves valuable time and personnel, its chief
defect is that there is no possibility of later checking out either
the work of the investigator or the reliability of the response.

(b) Alternatively, instead of precoding the questions, it may be
possible on the basis of the pre-test of the schedule to equip
the investigator with a ‘code book’ containing the codes to each
question. This again suffers from the same defects as (a)
above.

(c) The next best procedure would be to have the responses coded
after the data have been collected. Here, uniformity in coding
is ensured and bias in selection of responses and categorisation
are reduced to a minimum, if not eliminated altogether. How-
ever, in qualitative answers, the spirit behind the responses
may be lost by the coders.


HOW TO CODE


The coding to be used for different responses depends on the type 
of questions that may be asked. As seen in the earlier Chapter,
questions may be broadly classified into three categories, viz.,
closed questions, semi-open questions and open questions. In the
closed questions all the possible alternatives are given. In the
semi-open questions, although alternatives are given, a response
‘others (specify)’ is included in anticipation of responses which
may not be predetermined. In the open question no
alternatives are given and the respondent’s answers have to be
taken down verbatim. We shall now look at each one of these.


A very important point to remember is that it is most desirable 
when using electronic data processing facilities to resort to 
‘numeric’ codes. In manual processing, however, ‘alpha’ codes
may be resorted to in order to retain identification of the variables
and the response categories. But given a choice one should prefer
numerics.


CLOSED QUESTIONS 


It may be repeated here that the alternatives provided are not only 
exhaustive of all the possible answers, but also mutually exclusive. 
For example, to the question ‘what is your marital status?’ the 
responses Married/Unmarried/Widowed/Separated/Divorced are 
exhaustive and mutually exclusive. The coding here is to allot to 
each response a code number or letter viz., 


Married      1 or M        Unmarried    2 or U
Divorced     3 or D        Widowed      4 or W
Separated    5 or S


It will be seen that codes can be given serially or by using the 
first letter of the responses. In hand tabulation, it is better to allot 
the first letter as it reduces the strain on the memory and task of 
coders. Further, when responses are qualitative, the code is usually 
in the form of the first letter of the key word. Similarly, in 
quantitative responses, the code may be a number or a number 
and a letter. For example, to the question ‘what is your age’, the 
response may fall in one of the following categories viz., 


20-24/25-29/30-34/35-39/40-44/45-49/50-54/55-59/60-64, in
which case the codes would be:


Age in Years    Code
20-24           1 or 2A
25-29           2 or 2B
30-34           3 or 3A
35-39           4 or 3B
40-44           5 or 4A
45-49           6 or 4B
50-54           7 or 5A
55-59           8 or 5B
60-64           9 or 6A


It will be seen in the above example that: 

(a) the age groups are mutually exclusive 

(b) the categories are considered exhaustive if the scope of the 
study is confined to those between 20 and 64 years of age 

(c) two codes have been provided for each response category.
The first is a serial numbering of each category of answers. 
The second has the additional advantage, in that the response 
itself provides the clue to the code. Thus, any person who is 
aged x0 to x4 has a code xA and a person aged x5 to x9 has 
a code xB. Thus, a person of age 74 would be 7A and one 
of 79 would be 7B. 
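

Both coding rules for age can be written down as a short Python
function. This is a minimal sketch: the function name age_codes and
its parameters are made up for this illustration, and the serial
code assumes, as in the table above, five-year groups starting at
age 20.

```python
def age_codes(age, lowest=20, width=5):
    """Return (serial code, mnemonic code) for an age in completed years.

    serial   : 1 for the first age group, 2 for the second, and so on
    mnemonic : tens digit followed by A (units 0 to 4) or B (units 5 to 9)
    """
    serial = (age - lowest) // width + 1
    tens, units = divmod(age, 10)
    mnemonic = f"{tens}{'A' if units < 5 else 'B'}"
    return serial, mnemonic


for age in (20, 27, 49, 64, 74, 79):
    print(age, age_codes(age))
```

Note that the mnemonic code continues to work for ages outside the
scope of the study (74 gives 7A), whereas the serial code is only
meaningful within the nine groups provided for in the table.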


OPEN QUESTIONS 


The responses to open questions being recorded verbatim, it is obvious
there may be a large variety of qualitative answers. In principle, 
however, the procedure would be to study each response and 
prepare categories of answers. An example would illustrate this 
point best. 


In a study on the attitudes of lady students to the subject of women 
and education, one of the questions asked was whether they would 
work after getting married. The responses varied widely covering 
many aspects of the problem. A perusal of the responses, however, 
gave the first clue to categorising the responses.


Responses                                                  Code
(a) Those in favour of working after getting married       Y or 3
(b) Those in favour subject to certain conditions
    being fulfilled (conditional)                          C or 2
(c) Those who were undecided                               U or 1
(d) Those against working after marriage                   N or 0


On the basis of this, it may next be possible to sort out all 
responses according to these categories. The next step would be 
to study the responses in each category and further sub-classify 
them. For example, taking the first one, i.e., those in favour, it 
may be possible to prepare sub-categories in terms of: 


Responses                                    Code
1. Economic reasons                          Y-E or 11
2. Knowledge acquisition and application     Y-K or 12
3. Career viewpoint                          Y-C or 13
4. Family viewpoint                          Y-F or 14
5. Personal factors                          Y-P or 15


Similarly, each of the other categories may be sub-categorised by 
studying the responses and allotting appropriate codes. 


It may be mentioned here that while qualitative answers which 
are short and clear cut can be easily categorised and coded, long 
answers with many conditions and qualifications are more difficult 
to code. It could sometimes happen that a respondent may 
enumerate more than one answer or qualify the answers with more 
than one condition. In this case, it may be necessary to determine 
which is the more important of the ideas being stressed so that it 
may be categorised. It is not possible to provide for all the different 
responses. The degree to which it may be done would depend on 
the objectives of the study itself and how the data are to be finally 
presented. 


In order to facilitate proper categorisation and coding of open 
questions, the usual procedure is to take a sample of the schedules 
or questionnaires which have been completed and enter the respon-
ses of each open question of each schedule into a card. Later,
these cards are rearranged and formed into categories and coded. 


The procedure for semi-open questions is more or less similar to 
that of open questions. But here, it is simplified because there are 
very few responses, and the categorisation pattern is more or less 
laid down by the closed part of the question. Initially, it is 
necessary to have the categories as detailed as possible. 


CODE BOOK 


Considering the fact that in the initial processing we are using 
‘codes’ to represent different responses, and most often we use 
the same ‘codes’ to represent different answers (to different 
questions of course), it would not only be useful but necessary 
and very often inevitable that we keep a record of what response 
to which questions has been represented by what ‘code’. 


To illustrate the point, code ‘1’ could represent the response ‘yes’. 
It could also represent the response category ‘young’. It could 
also mean ‘unmarried’. Of course, as the questions change, the 
meaning of code ‘1’ also changes. But it would be necessary to 
either keep the codes in memory (assuming it never fails) or 
preferably in some record. 


In fact, this is most important when numeric codes are used in 
both manual and electronic data processing. In the latter, we can 
prepare a ‘directory’ of responses and their related codes and 
require the computer to ‘decode the codes’ every time it reports 
the responses. The record in which all this information is kept is 
called a CODE BOOK. A typical code book has the following format: 


Qn. No.   Col.   Question      Code: Response
-         1-4    Serial No.    Actual (as in schedule)
1         5      Sex           1...Male
                               2...Female
2         6-7    Age           Actual


and so on.


First, a couple of comments. Column one is self-explanatory.
But what is the second column? It refers to the column number
in which the response is to be recorded (and so will be found
later) in a CODE SHEET (which is the sheet in which you will be
recording or entering the codes). So when the information for the
first line is 1-4, it means that the schedule serial number (which
is a four digit number in this example) would be recorded in
columns 1, 2, 3 and 4 (e.g., 0001, 0002 ... 0120, 0121 ... 1123, 1124
and so on), that is, ONE DIGIT PER COLUMN. The third column in the
code book is also self-explanatory. The fourth tells you what
‘numeric’ to enter when the response is as listed. Thus, if it is a
male, code 1 will be entered in column 5. Similarly, age 24 will
be entered in columns 6-7 (i.e., you will enter 2 in column 6 and
4 in column 7 and so on).
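

To see how the column numbers work as an ‘address’, here is a
small Python sketch that reads one line of a code sheet using the
three code book entries shown above. The names CODE_BOOK,
SEX_LABELS and read_record are invented for this illustration; they
are not part of any particular software.

```python
# Code book: field name -> (first column, last column), counted from 1
CODE_BOOK = {
    "serial_no": (1, 4),
    "sex": (5, 5),
    "age": (6, 7),
}

# Directory used to 'decode the codes' for the sex question
SEX_LABELS = {"1": "Male", "2": "Female"}


def read_record(line):
    """Pick each field out of a code sheet line by its column numbers."""
    fields = {
        name: line[first - 1:last]          # columns are 1-based, Python is 0-based
        for name, (first, last) in CODE_BOOK.items()
    }
    return {
        "serial_no": fields["serial_no"],
        "sex": SEX_LABELS.get(fields["sex"], "illegal code"),
        "age": int(fields["age"]),
    }


# Schedule 0001, male (code 1), aged 24: one digit per column
record = read_record("0001124")
print(record)
```

The point of the rigid layout is visible here: the program never
searches for a value, it simply goes to the stated columns.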


You must know why it is normally necessary to follow such a 
rigid principle for coding and recording data. This will help the 
next step of processing called tabulation. If you are using a 
computer for your work, you can tell the machine where your 
data are recorded and every time you call for it, it will go straight 
to that ‘address’ and never make a mistake. 


TABULATION 


Tabulation is the procedure adopted to summarise the collected 
data into some convenient and manageable form. It enables one 
to dispense with the schedules, which are cumbersome to handle. 
It is, therefore, the step between data collection on the one hand, 
and analysis on the other. Tabulation can be a one step or a two 
step procedure. In a one step procedure, you can get your sum-
marisation simultaneously with the process of tabulating or
transferring the information from the interview schedule to the 
tabulation sheet. 


But in the two step procedure, you will first have to transfer the 
data from the interview schedules to a convenient ‘storage place’
and then take the step which is literally or figuratively the same as
the one step procedure. This ‘storage place’ could be a sheet or
sheets of paper or a computer memory device like the hard disk or the
floppy disk.


In computerised tabulation, the data in the schedules are coded
and initially transferred through appropriate ‘software’ into a
computer memory device. This stored information is then sorted
through other software and tabulated. More of this will be
presented later.


In manual or hand tabulation, work is manually done by a person 
or persons, using no mechanical devices. Hand tabulation may 
follow the principles of machine tabulation, either wholly or 
partially, depending on whether the data are coded or not. This 
will be discussed in greater detail later. 


The one step tabulation procedure and the second step of the two 
step tabulation will be taken up in the next chapter. But here, we 
continue with the first step of the two step procedure as it will 
then bring on par both the procedures. 


The third technique of tabulation is the ‘master sheet’. In principle, 
while the procedure is similar to that of the ‘tally sheet’ method,
here, only one column is assigned to one question and the alter-
native answers to a question are precoded and the code numbers
entered in the appropriate column. An example is given below.
The responses to the questions on marital status and economic
status are coded as follows:


Marital status   Code No.     Economic status     Code No.
Married          1            Earner              1
Unmarried        2            Earner dependent    2
Widowed          3            Dependent           3
Divorced         4


Then the master sheet would be as follows:


Sch. No.   MS   ES   AGE   SEX   OCC   IND   INC
001        1    1    34    1     3     2     1650
002        2    2    42    1     5     4     2250
003        1    3    24    2     1     5     1000
004        2    3    18    1     1     3      750
005        3    2    56    2     8     6     5800


The main feature of this procedure is that as far as possible all 
the questions are included in just one sheet, and as many schedules 
as possible are included in the same sheet. Further, it enables one 
to see tendencies at a glance. 


It is useful and practicable for large studies, unlike the tally sheet 
method which is useful for small studies using hand tabulation.
It is also the most convenient procedure in undertaking cross 
tabulation. 


In all the above techniques except the tally method (where in 
place of a tally mark against the appropriate response, the number 
of the schedule is put), one row is allotted to each schedule. But, 
where a detailed tabulation of a household in terms of some 
demographic data has to be made, a row is allotted to each member 
of the family. It would be as follows: 


Sch. No.   Serial No.   Rel. to head   Age   Sex   M.S.
                        of family
1          1            Head           50    M     M
           2            Wife           45    F     M
           3            Son            20    M     U
           4            Daughter       20    F     W
           5            Daughter       15    F     U
           6            Son            10    M     W
2          1            Head           45    M     W
           2            Son            20    M     U
           3            Daughter       15    F     U


This provides for detailed age-sex specific analysis. Out of these
tabulated data, frequency distributions may be obtained in the first
instance.

So with the above, we complete the first few steps in data 
processing. However, it would be useful to have a word about 
computer usage. It would be unfair in this computer age to pretend 
that you will not be using it, or more unfortunate, to assume that 
you should not use it because you must ‘learn the rudiments’ the 
hard way before you take the ‘easy path’. Therefore, here is some 
information over and above what has been given earlier. 


CHAPTER 8 


DATA 
ANALYSIS 


Now that the voluminous data have been collected and processed, 
the next step in the research project is to meaningfully summarise 
all the information that has been collected and recorded. At this 
point, it is very important that you review your analysis design 
for the study. It will give you valuable information on how to 
proceed with the analysis of your data. 


Generally speaking, there are a number of ways in which you can 
summarise and analyse your voluminous data. The more common
procedures, which are presented in this Chapter, are the
following:


A. Frequency Distributions and related statistics: 
Mean, Mode, Range and Standard Deviation, 
B. Comparing Sub-groups and related statistics: 
Grouped Data cross tabulations: 
Percentages, Chi Square test, Contingency Coefficient 
Ungrouped Data analysis: 
Pearson Correlation Coefficient 


In addition, you can also estimate the corresponding values for
the universe on the basis of the values obtained for the sample.
This too will be demonstrated in the concluding pages of this
Chapter.

Of course, there are a large number of other statistical procedures 
which are useful in analysing research data. You can explore these 
as you gain in skills and experience. But this Primer will introduce 
you to just the most common statistical procedures for budding 
researchers.


FREQUENCIES 


RAW OR UNGROUPED DATA FREQUENCIES 


At the bottom of the ladder, so to say, are the raw data made up 
of all the responses received from all the respondents. If in a 
study, we have 500 respondents and answers to 25 questions, the
total number of bits of information with us is 500 × 25 = 12,500.
Retaining all this and merely staring at these bits of information
does not tell us anything about anybody. So the first thing to do is
to climb the first rung of the ladder and ‘summarise by counting the
frequencies’ and prepare the FREQUENCY distribution. What does
this mean? Given the large number of responses like male/female,
24 years old, 25 years old, 26 years old, and so on, we count how
many are males, how many are females, how many are 24 years
of age, how many are 25 and so on.


Let us get the frequency counts for all our responses. Incidentally, 
when you take on this task, you may also get a by-product. You 
are able to detect ‘illegal codes’. Illegal codes are those which 
have not been provided for in the code book. More generally, 
those responses or values, not provided for and having no 
‘legitimate meaning’ or not representing any known response, are 
illegal codes. To give just one simple example. In your study, 
you may have provided for code 1 to represent men and code 2 
to represent women. If by chance, a 3 or a 4 appears in the results, 
you have no way of interpreting this as representing a known
response or value. So, these extra codes are illegal codes and have
to be eliminated.
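

The way a frequency count exposes illegal codes can be shown with
a short Python sketch. The ten code values below are hypothetical
and chosen only to illustrate the point; the legal codes are the
two given in the example above (1 for men, 2 for women).

```python
from collections import Counter

# Legal codes for the sex question, as provided for in the code book
LEGAL_SEX_CODES = {1: "Male", 2: "Female"}

# Hypothetical codes entered for ten schedules, in schedule order
sex_codes = [1, 2, 2, 1, 3, 1, 2, 4, 1, 2]

# Frequency count: how many times each code occurs
frequencies = Counter(sex_codes)

# Any code that is counted but not in the code book is illegal
illegal = {code: n for code, n in frequencies.items()
           if code not in LEGAL_SEX_CODES}

# Schedule numbers (counting from 1) that need to be checked
to_check = [i + 1 for i, code in enumerate(sex_codes)
            if code not in LEGAL_SEX_CODES]

print(frequencies)
print(illegal)
print(to_check)
```

The list of schedule numbers is what you take back to the code
sheet and the interview schedule when tracing the error.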


The question you may now have is this: How to eliminate the 
illegal codes from the analysis? To answer this question, it would 
be necessary to answer another question: At what stage in the 
study did this illegal code enter? 


To answer the second question first: By its very name you will 
realise that an illegal code can enter the main stream of the study 
only when the data coding takes place. Invariably, data coding 
takes place after the information has been collected and the various 
responses have been given alphabetical (alpha) or numeric code 
identities. When this identity is being given, you will not give a 
code to a non-existent response. For remember that an illegal code 
is one with no corresponding response.


So the illegal code will enter the mainstream only when the actual 
coding of the data is taking place. The person who is doing the 
coding (the Coder) may, by mistake, give a non-existent code 
number to a response. You may immediately ask: Why should 
the Coder make this mistake? There could be quite a few reasons 
for this but the more common one is that the Coder relies on her 
memory and records a code number that she thinks is appropriate 
to the response. She does not cross check with the Code Book. 


Another point at which the error can enter is during data entry
into a computer. The punch operator reads the code number in 
the code sheet and punches the same number into the computer. 
In the process of punching in the code number, she can make a 
number of mistakes. The most common are two in number: 


1. She misreads the code number and so punches the wrong 
code. This error is quite common because one can misread 
numbers. For example, a 1 could look like a 7 or vice versa. 
Similarly, 3 and 8 can be mistaken for each other. So also 0
and 6.

2. The punch operator has, by mistake, punched the wrong
number. This can happen when the operator is careless.


Now that we know the point at which the error could occur, let
us go back to the first question: How to rectify the error. First,
compare the ‘computer output’ with the code-sheet. If the error
is in the former and not the latter, then correct the computer data
input record.


If the computer output and the code-sheet show the same code 
number, or if you have not used a computer for data entry but 
have done manual data entry in a code sheet, then compare the 
code sheet entry with the interview schedule response that has 
been recorded. If there is a difference, and the code-sheet entry
is wrong, correct it as well as the computer entry. 


If there is no difference between the code-sheet entry and the 
interview schedule response, then obviously the code is illegal 
only in the sense that it was not in the code book, but somehow 
got recorded in the mind of the Coder. In that case, correct the 
Code Book. 


Having corrected the illegal codes, let us get back to the task of
tabulation. The simplest form of tabulation is the tally technique. 
This aims to find out the frequency distribution of the respondents 
according to their responses to a given question. 


All the legal responses are first listed in the first column in the 
sequence in which the responses appear in the interview schedule 
or the code book. If the question is ‘open ended’, the list of 
responses, along with their respective codes may be recorded in 
the tally sheet as they occur. Alternately, if you can approximately 
guess what responses are most frequently given, you could list 
them in the first column even before you start the tabulation. 


The response to the particular question is sought in each schedule 
and the schedule number is entered in the appropriate row. Finally, 
the number of schedules is totalled up against each response and
recorded in the last column. See the example below for 30
schedules.


Responses    Schedule Number (s/n)           Frequency
Married      1,3,8,12,17,18,19,20,27         9
Unmarried    2,4,9,13,21,24,26,28            8
Widowed      5,7,10,15,23,29                 6
Divorced     6,11,14,16,22,25,30             7


The advantage of this procedure is that both the tabulation and 
final tables are evolved simultaneously. For, the table on ‘marital
status’ would consist of the first and third columns, the second 
or middle being deleted. 
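

The same tally can be sketched in Python, which also gives a quick
check that no schedule has been left out or entered twice. The
schedule numbers are those of the example above, reading the
widowed row as 5, 7, 10, 15, 23 and 29; the variable names are made
up for this illustration.

```python
# Tally sheet: response -> schedule numbers entered against it
tally = {
    "Married":   [1, 3, 8, 12, 17, 18, 19, 20, 27],
    "Unmarried": [2, 4, 9, 13, 21, 24, 26, 28],
    "Widowed":   [5, 7, 10, 15, 23, 29],
    "Divorced":  [6, 11, 14, 16, 22, 25, 30],
}

# Last column of the tally sheet: number of schedules per response
frequency = {response: len(numbers) for response, numbers in tally.items()}

# Check: every schedule from 1 to 30 appears once and only once
entered = sorted(n for numbers in tally.values() for n in numbers)
complete = entered == list(range(1, 31))

print(frequency)
print("All 30 schedules tallied exactly once:", complete)
```

Dropping the list of schedule numbers and keeping only the
frequency is exactly the step of deleting the middle column.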


A major disadvantage of this technique is that further analysis of 
the data becomes a tedious procedure. This is because one has to
refer to a number of such work tables before a final cross table 
can be prepared. Hence, this procedure is utilised only when you 
are interested in arriving at frequency distributions and ‘one-way 
tables’ or, at the most ‘two-way tables’. 


Incidentally a one-way table is one which shows the frequency 
distribution for one variable. A two-way table presents a cross 
tabulation of two variables. 


We now go on to the second technique of tabulation. This differs 
from the first in just one way. The form of this is given below: 


        Marital       Education        Occupation
Sch.    M U W D       I L M UG G       BC WCC WCE NE
1       / / /
2       / / /
3       / / /
4       / / /
5       / / /
6       / / /
7       / / /
8       / / /
9       / / /
10      / / /


As you will see above, each ‘tally sheet’ will contain the data of
15 to 20 schedules and for 5-10 questions. You will also notice
that each response/code has its own allotted column. Sub-totals
of each such tally sheet will then be added up to arrive at the
final frequency distribution.


The advantage of this procedure is that all information about a 
respondent is available in one row. Further, it renders cross-tabula- 
tion easy for the purpose of preparing two-way and three-way 
tables. Here is an example of how you will prepare a two way 
table to determine the association between education and occupa- 
tion: 


First, prepare a dummy table to receive the information. Next, 
enter the information about each respondent into appropriate boxes 
or cells. Follow the example below which is derived from the 
data above. 


        I           L         M      U        G
BC      1,3,9,10    2,4,7
        (s/ns)      (s/ns)
WCC                                  8        5
                                     (s/n)    (s/n)
WCE                                           6
                                              (s/n)

According to this incomplete cross-tabulation, 4 of the respondents 
are illiterate blue collar workers, 3 are literate blue collar workers 
and so on. So the summarisation of this table would be the sum 
in each cell as follows: 


Occupation            Education
                      Illit.      Liter.      Matric   Under-grad   Grad
Blue Collar           4 (100%)    3 (100%)    -        -            -
White C. Clerical     -           -           -        1 (100%)     1 (50%)
W.C. Executive        -           -           -        -            1 (50%)
Total (100%)          4           3           -        1            2


We are now able to see a pattern emerging. The higher the
education level, the greater the probability that the job would be
higher.
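

The cross-tabulation just carried out by hand can be sketched in
Python as follows. The education and occupation of the ten
respondents are taken from the working table above (for example,
schedules 1, 3, 9 and 10 are illiterate blue collar workers); the
variable names are invented for this illustration.

```python
from collections import Counter

# Schedule number -> (education, occupation), from the working table
respondents = {
    1: ("I", "BC"),   2: ("L", "BC"),   3: ("I", "BC"),  4: ("L", "BC"),
    5: ("G", "WCC"),  6: ("G", "WCE"),  7: ("L", "BC"),  8: ("UG", "WCC"),
    9: ("I", "BC"),  10: ("I", "BC"),
}

# Count the respondents falling in each (occupation, education) cell
cells = Counter((occ, edu) for edu, occ in respondents.values())

# Column totals: number of respondents at each education level
column_totals = Counter(edu for edu, _ in respondents.values())

# Each cell as a percentage of its education column
percentages = {
    (occ, edu): round(100 * n / column_totals[edu])
    for (occ, edu), n in cells.items()
}

for (occ, edu), n in sorted(cells.items()):
    print(f"{occ:4} {edu:3} {n} ({percentages[(occ, edu)]}%)")
```

Percentages are taken down each education column because the
question being asked is how occupation varies with education.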


A serious disadvantage of the tally sheet is that it takes up a lot 
of space and paper. An even more important disadvantage is that 
if and when you come across ‘new codes’ that you had not 
anticipated and/or provided for when allotting specific columns 
to codes, you would be hard pressed to ‘park’ these ‘newcomers’. 
Hence, a further modification of this procedure is required. 


You will easily recognise in the tally sheet below, the modification 
that has been made to the tally sheet presented above. Yes, the 
modification is to ‘collapse’ all the columns that related to one 
question and so all responses are recorded in the same column as 
you will see below.


Sch.   MS   ED   OCC
1      M    I    B
2      U    L    B
3      M    I    B
4      U    L    B
5      W    U    C
6      D    G    E
7      W    L    B
8      U    U    C
9      U    I    B
10     W    I    B

The above tally sheet has two distinct features. First, only one 
column is allotted per question and so a very large number of
questions can be accommodated in the same ‘code sheet’. (Earlier
we called it the tally sheet because we were making tally marks,
but now we are entering codes.) The second feature is that we
have to enter the code number to represent the responses. Hence, 
the totalling of responses becomes a bit tedious. 


The primary advantage is that the cross-tabulation can be done 
with very little effort for the questions and their responses are 
physically close to one another. You would think that there would
be a bit of a problem when you want to cross-tabulate questions
which are on different sheets. This is not so. You can fold the
two sheets in such a way that the two question columns are next
to each other.


You will find that the frequency distribution, the first rung in data 
summarisation and analysis, is a simple, straightforward way of 
‘reducing’ the voluminous data into a form that can reveal some 
meaningful information about the responses and respondents. 


GROUPED DATA FREQUENCIES 


Having prepared frequency distributions, let us take just one more
example. This is with respect to age. 


Let us assume that you have interviewed a cross-section of the 
adult population, i.e., all those over 18 years of age. Glancing
through the age data you find that the youngest person is 25 years
of age and the oldest is 78 years. The age responses will range 
from 25 to 78, i.e., 54 single year age groups. It is no doubt a 
summarisation of the raw data but it is still unmanageably long. 
Look at the age distribution presented below. 


Age   No.      Age   No.      Age   No.
25    1        26    1        28    2
29    3        30    9        31    7
32    7        33    9        34    19
35    18       36    10       37    11
38    16       39    8        40    17
41    9        42    12       43    10
44    10       45    16       46    5
47    6        48    7        49    7
50    15       51    6        52    13
53    3        54    8        55    7
56    4        57    1        58    2
59    3        60    7        61    5
62    7        63    3        64    1
65    1        66    3        68    2
70    1        71    1        73    1
76    1        78    1


Such detailed information does not help us to arrive at the profile
of the sample. The distribution is too long to get a clear idea of
how many are very young, how many are adults and so on. So
we must try and further summarise the data to arrive at a general
profile based on some major categories, as we did with the
categories for sex.


In order to summarise the data, you can use any one of the 
following four procedures: 


— Demographic classification 

— Census Working Age group classification 
— Equal sub-groups classification 

— Statistical procedure. 


We shall consider each of these briefly.


Demographic: In this procedure, the age data are classified in 10-
year age groups, say 20 to 29, 30 to 39 and so forth. Using this
classification, we get the following summarisation of the above
given age data:


Age Gp.   No.    %
25-29     7      2.2
30-39     114    36.1
40-49     99     31.3
50-59     62     20.2
60-69     29     9.5
70-78     5      1.6


You will notice that the two extremities have very small numbers 
of respondents. In order to ‘improve’ the distribution you may 
decide to expand the two extreme groups so that the first now 
becomes upto 39 (121 respondents or 38.3 per cent), and at the 
other end, you have age 60 and more (with 34 respondents or 
11.1 per cent). 

Census: The census classifies the working population into four
age categories, viz., 14 to 34 years of age (or young workers), 35
to 54 (middle category), 55 to 64 (old group), and those 65 and over
(the aged). Using this classification procedure, the age data will
get summarised as follows:


Group    No.    %
25-34    58     18.4
35-54    207    65.5
55-64    40     12.6
65-78    11     3.5


You will notice from the above that the highest age group has
too few respondents compared to the other three. Therefore, you
have a choice of adding this last group to the previous one so
that the last is now 55 and more and the number of respondents
increases to 51 or 16.1 per cent.
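

Both the demographic and the census groupings can be reproduced
from the single year distribution with a few lines of Python. This
is a sketch under stated assumptions: the counts are read from the
age table given earlier, with a count of 3 at age 63 inferred from
the group totals (the 60-69 and 55-64 groups only add up with it),
and the function name group_ages is made up for this illustration.
Percentages are computed on the total of 316 respondents, so one or
two may differ slightly from the printed figures.

```python
# Single year age distribution: age -> number of respondents.
# The entry for age 63 is an inference from the group totals.
age_counts = {
    25: 1, 26: 1, 28: 2, 29: 3,
    30: 9, 31: 7, 32: 7, 33: 9, 34: 19, 35: 18, 36: 10, 37: 11, 38: 16, 39: 8,
    40: 17, 41: 9, 42: 12, 43: 10, 44: 10, 45: 16, 46: 5, 47: 6, 48: 7, 49: 7,
    50: 15, 51: 6, 52: 13, 53: 3, 54: 8, 55: 7, 56: 4, 57: 1, 58: 2, 59: 3,
    60: 7, 61: 5, 62: 7, 63: 3, 64: 1, 65: 1, 66: 3, 68: 2,
    70: 1, 71: 1, 73: 1, 76: 1, 78: 1,
}


def group_ages(counts, groups):
    """Summarise single year counts into the given (low, high) age groups."""
    total = sum(counts.values())
    summary = {}
    for low, high in groups:
        size = sum(n for age, n in counts.items() if low <= age <= high)
        summary[f"{low}-{high}"] = (size, round(100 * size / total, 1))
    return summary


demographic = group_ages(
    age_counts, [(25, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 78)])
census = group_ages(
    age_counts, [(25, 34), (35, 54), (55, 64), (65, 78)])

print(demographic)
print(census)
```

Merging two groups, as suggested above for the small groups at the
extremes, only requires changing the list of age limits passed to
the function.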


There is only one problem with both the demographic and census 
classifications. While these may be very good procedures when 
dealing with the general population, we may run into difficulties 
while dealing with ‘special groups’ like directors of community 
projects. 


Equal Sub-groups: In this procedure you divide the respondents 
into three equal parts. You allot the ‘youngest’ one-third to the
‘bottom’ category, the next one-third of respondents to the ‘middle’ 
category, and the remaining top one-third to the highest category. 
This can be done easily from the above single year age categories 
because the last column has done the ‘homework’ for us. 


Looking back at the single year age distribution, we find that the 
percentage values 33.3 and 66.7 do not appear in the last column. 
This is not surprising because data do not always oblige us with 
exact figures. So, we take the nearest cumulative percentages and 
this is what we get. 


bottom 1/3 ( upto 35.8%) (age group 25 to 38) = 113 (35.8%) 
middle 1/3 ( upto 67.4%) (age group 39 to 48) = 100 (31.6%) 
top 1/3 (all 100%) (age group 49 +) = 103 (32.6%) 
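

The search for the nearest cumulative percentages can be sketched
in Python as follows. This is only an illustration: the function
name nearest_cut and the small age distribution are made up for the
example and are not the single year data of the study.

```python
def nearest_cut(freq, target_pct):
    """Return the value whose cumulative percentage is closest to target_pct.

    freq maps each value (e.g. an age) to its number of respondents.
    """
    total = sum(freq.values())
    running = 0
    best_value, best_gap = None, float("inf")
    for value in sorted(freq):
        running += freq[value]
        gap = abs(100.0 * running / total - target_pct)
        if gap < best_gap:
            best_value, best_gap = value, gap
    return best_value


# Hypothetical distribution (age -> respondents), NOT the study data.
ages = {25: 2, 30: 3, 35: 4, 40: 5, 45: 3, 50: 2, 55: 1}

lower_cut = nearest_cut(ages, 100 / 3)   # last age of the bottom third
upper_cut = nearest_cut(ages, 200 / 3)   # last age of the middle third
print(lower_cut, upper_cut)
```

The bottom group then runs up to lower_cut, the middle group from
there up to upper_cut, and the top group takes the rest.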


Now, we have almost three equal groups. But the question that you
may ask is this: What is the ‘meaning’ that can be given to this
peculiar division of age as below 39, above 48 and those in
between? It is neither bio-social, nor medical, nor even demographic
or census oriented. What is the use of this purely mathematical
division? On the face of it, there seems to be no rational answer
to these questions. However, it is relevant because we may, just
may, come across a situation where we want to compare the three
groups and so start by having an equal number in each. That is
to say, we may predetermine to have an equal number in each
category.


Statistical: This is a rather involved procedure which will be
explained shortly. But at this point, it is worth mentioning that
much of statistical analysis is dependent on the initial statistics
that are used, and this procedure, therefore, recommends itself
because it too uses the same basic statistics. These are the MEAN
and the STANDARD DEVIATION. As already stated, we shall discuss
the procedures for computing these two statistics later. Right
now, we shall assume the values as we go along with a discussion
of what the statistical classification procedure is all about.


First compute the mean and standard deviation of the distribution.
For our data the values are 44.2 and 9.75, respectively. The second
step is to subtract the standard deviation from the mean. Thirdly,
add the mean and the standard deviation. What do we get?
Rounding off the decimal values in the results, the lower value
is 34 years and the upper value is 54 years. In a normal
distribution, we can expect about 68 per cent of the respondents
to be within this age group of 34 to 54 years. That is to say,
68 per cent would normally be within one standard deviation (plus
and minus) of the mean. But in our example, we find that 72 per
cent are in it. This is mainly because the single largest age
group, 34 years, with 19 respondents, falls in this category. At
the two ends, we have 12 per cent of the respondents who can be
considered as the young respondents (more than one standard
deviation below the mean, and in the age group 25 to 33 years).
And another 16 per cent form the old age group (more than one
standard deviation above the mean value, and 55 years to 78 years
of age).


The groups now are as follows: 


Age group              No.    Per cent

25 to 33 (young)        38       12

34 to 54 (middle)      228       72

55 to 78 (old)          50       16

Total                  316      100
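

The cut-off ages of this statistical procedure can be checked with
a few lines of Python. This is a minimal sketch; the function names
sd_band and classify are our own, and the mean and standard
deviation are simply the values quoted above.

```python
def sd_band(mean, sd):
    """Rounded ages lying one standard deviation below and above the mean."""
    return round(mean - sd), round(mean + sd)


def classify(age, lower, upper):
    """Place an age in the young, middle or old group."""
    if age < lower:
        return "young"
    if age > upper:
        return "old"
    return "middle"


lower, upper = sd_band(44.2, 9.75)   # values taken from the text
print(lower, upper)
print(classify(33, lower, upper), classify(40, lower, upper),
      classify(55, lower, upper))
```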


Let us now try and understand this. These results show that there
is a clear-cut differential between the young and the old. So for all
future analysis, we know that, statistically speaking at least, we
have two totally divergent groups: the young at one extreme and the
old at the other extreme. In fact, this is quite a reasonable and safe
statistical criterion or procedure for differentiating ‘extreme groups’
for comparison purposes. It does not depend on your feelings of
how to divide. While you can talk of one-third each, another person
can recommend a 25, 50, 25 split to get the three groups. A third
person may recommend a 17, 66, 17 break-up, or even a 10, 80, 10
or any other criterion. All these are useful and interesting but there
is no real criterion external to one’s own feelings of what can be
done. So, when in doubt, the statistical procedure would be a good
one to adopt for quantitative variables.


Now that you know about the different procedures for classifying
data, and have been introduced to two statistical terms, viz. mean
and standard deviation, let us move on to the statistical
components of data analysis.


MEASURES OF CENTRAL TENDENCY 


MEAN 


How do we compute the mean? The mean or the average value
of a set of numbers is obtained by adding up the numbers concerned
and dividing the total so obtained by the number of numbers that
were added up in the first place. So if we have five numbers, say
1, 3, 5, 7, 9, then these total up to 25 which, when divided by
five (the number of numbers added up), gives the result of FIVE,
which is the mean or the average.
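

The same rule written as a short Python function, as a minimal
sketch (the name simple_mean is our own):

```python
def simple_mean(values):
    """Add the numbers up and divide by how many there are."""
    return sum(values) / len(values)


print(simple_mean([1, 3, 5, 7, 9]))
```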


Though the method of calculating the average as given above is 
simple for a small number of cases or respondents, the work 
becomes tedious when a large to very large number is to be 
handled as in the example of the age distribution above. So we 
have to do our calculation step by step. First, get the grouped data 
frequency distribution. Then, follow the steps given in the working 
table below. 


Class       Mid-point of    Frequency    Product of frequency
interval    the class                    and mid-point of
            interval                     class interval

(1)         (2)             (3)          (4) = (2) * (3)
25-34       29.5              58           1711.0
35-54       44.5             207           9211.5
55-64       59.5              40           2380.0
65-78       71.5              11            786.5
Total                        316          14089.0


Mean age = 14089/316 = 44.6 years of age 
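

The working table can be reproduced in Python. This is a sketch
only; the function name grouped_mean and the way the classes are
written (lower limit, upper limit, frequency) are our own choices.

```python
def grouped_mean(classes):
    """Mean from grouped data.

    classes is a list of (lower limit, upper limit, frequency).
    Each class is represented by its mid-point, as in column (2).
    """
    total_frequency = sum(f for _, _, f in classes)
    total_product = sum(f * (low + high) / 2 for low, high, f in classes)
    return total_product / total_frequency


age_classes = [(25, 34, 58), (35, 54, 207), (55, 64, 40), (65, 78, 11)]
print(round(grouped_mean(age_classes), 1))
```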


The average age that you would get if you calculated on the basis
of the raw data is 44.2 years. The most accurate average age will
be the one based on the raw data, i.e., 44.2 years, and the one
calculated from the grouped data is an approximation. If the
number of people is large and the number of class intervals is
also not small, then the difference between the two values will
also not be large. The more class intervals there are, the smaller
the difference will be.


There is a simpler way of calculating the mean value. But it
involves a couple of additional steps. If you have a simple
calculator at your disposal, then you can easily follow the
procedure reported above. If you have access to a computer, you
have nothing to worry about beyond ‘feeding in’ the command for
computing not only the mean but also a host of other values. But
if you have no access to either of these and not even the logarithm
tables, then you just get hold of a statistics textbook which will
guide you in using the simplified procedure.


Sometimes the data are so grouped that either the first class interval
or the last one or both are left open, i.e., either the lowest value of
the first class interval or the highest value of the last class interval
or both are not stated. For example, in the following frequency
distribution, the lowest and the highest value are not stated:


Age Groups
upto 34
35 - 54
55 - 64
65 and above


The question here is this: What would be the mid-point for the
first and last categories? The rule-of-thumb is to treat an open
class as being as wide as the class next to it, and so to subtract
from (or add to) the mid-point of that neighbouring class the
number of values it covers. In the above example, the mid-point
of 35-54 is 44.5 and the class covers 20 values. From this,
subtract 20 and you get 24.5 as the mid-point for the first class
interval. Similarly, the mid-point for the last category would be
59.5 plus 10, i.e., 69.5 or about 70. We can then compute the
mean, based on these revised (and rounded) mid-points.


MP       Freq      Sum

25         58     1450
45        207     9315
60         40     2400
70         11      770
Total     316    13935


Mean age = 13935/316 = 44.1 years.


This result is somewhat different from the earlier ones. So what
is the average age of the group of 316 persons? The answer is
44.2 years, the value based on the raw data.
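

The rule-of-thumb for open-ended classes can be sketched in Python
as below. The function names are our own, and the sketch assumes,
as the text does, that an open class is as wide as its neighbour.
The mean is then computed with the rounded mid-points used in the
table above.

```python
def open_class_midpoints(first_closed, last_closed):
    """Estimate mid-points for an open first and an open last class.

    first_closed and last_closed are the (lower, upper) limits of the
    closed classes lying next to the open ones.
    """
    low1, high1 = first_closed
    low2, high2 = last_closed
    width1 = high1 - low1 + 1          # number of values in 35-54 is 20
    width2 = high2 - low2 + 1          # number of values in 55-64 is 10
    first_mid = (low1 + high1) / 2 - width1
    last_mid = (low2 + high2) / 2 + width2
    return first_mid, last_mid


def weighted_mean(midpoints, frequencies):
    """Sum of mid-point times frequency, divided by total frequency."""
    products = sum(m * f for m, f in zip(midpoints, frequencies))
    return products / sum(frequencies)


print(open_class_midpoints((35, 54), (55, 64)))
print(round(weighted_mean([25, 45, 60, 70], [58, 207, 40, 11]), 1))
```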


More generally, you have quite some choice in deciding what the
class interval for your quantitative data should be. You will realise
by now that the mean of a qualitative variable means very little.
This is, in fact, true of some quantitative variables as well, e.g.,
the average family size. The mean, especially the decimal part of
it, would be notional. But then we need these notional values as
well when undertaking extensive statistical analysis.


The factors which one should take into consideration in forming
class intervals are:


(i) the total range of variation of the variable
(ii) the total number of observations
(iii) the object of analysis
(iv) the accuracy of the variable studied.


It would be preferable for other computational purposes to have
equal class intervals, but if this results in skewed end groups
especially, it may be preferable to have unequal class intervals.
Again, the lower limit of the first class interval need not be equal
to the lowest value in the table, but can be any number less than
or equal to it. Similarly, the upper limit of the last class interval
need not be the highest value in the table but can be any number
more than or equal to it. Some adjustments of this sort may be
necessary to have well-defined equal class intervals.


MODE 


The second measure of the central tendency that is commonly
used is the MODE. In the series 1 2 3 3 3 4 4 5, the value that
occurs most often is 3. This is the mode of the series. If you
re-examine the original frequency distribution of the age data, you
will find that the modal age is 34.


When there are class intervals, as in the case of the grouped age
distribution, the calculation of the mode is more involved and is
therefore not given here.
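

For ungrouped values like the short series above, the mode can be
found by simple counting. The following Python lines are a minimal
sketch (the function name modes is our own); it returns a list
because a series can have more than one most frequent value.

```python
from collections import Counter


def modes(values):
    """Return the value or values that occur most often."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 3, 3, 3, 4, 4, 5]))
```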


COMPARING MEAN AND MODE 


Usually, the mean is a better measure of central tendency than 
the mode. It has a distinct advantage over the other from the 
theoretical point of view, particularly because it can be easily 
calculated and is capable of algebraic manipulation. But if some 
of the items are extremely small or large, the mean may be 
defective as a measure of central tendency. Consider the following 
example: 


20, 20, 21, 21, 20, 21, 20, 500.


Most of the values are in the neighbourhood of 20. Therefore, 
measures of central tendency should be in the neighbourhood of 
20. 


Mean = 80 Mode = 20 


Hence, the mode is a better measure of central tendency than the 
mean in this case. The mode is hardly affected by extreme values. 
Therefore, it should be preferred in such cases. 
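

The effect of one extreme value on the two measures can be seen
with Python's standard statistics module. The figures below are
illustrative values of our own choosing (mostly near 20, with one
very large item) and are not necessarily the author's exact series.

```python
import statistics

# Illustrative values: seven items near 20 and one extreme item.
values = [20, 20, 21, 21, 20, 21, 20, 500]

average = statistics.mean(values)   # pulled upwards by the value 500
typical = statistics.mode(values)   # the most frequent value

print(round(average), typical)
```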


The decision as to what type of measure should be calculated 
depends to a great extent on the purpose of calculation. Since the 
two measures embody different concepts, it may sometimes be 
advisable to use both the measures. The mean can be either greater 
or lesser than the mode. 


The mean and mode give us an idea of the typical or common value
of the distribution. But we do not use both these all the time. For
instance, when we refer to qualitative variables (you will recall
that the sex variable is one such variable), we use the mode, the
response category that occurs most often in the distribution. You
will remember that the modal response category was the male in
the sex distribution. But an important point to note here is that
the mode does not take into account other categories of responses.
One might say, it ignores the other categories. In our example, it
is the female category that is ignored. So this is its one major
limitation.


When we refer to quantitative variables, we use the mean. For 
example, the mean age of our sample is 44.2 years. You can 
compute the mode as well. In our example, it is 34. Why this 
difference? First, it is 34 because that is the age category which 
has the single largest number of respondents (19). All other 
categories have less than this number. It will be noticed again 
that the mode of 34 has totally ignored the remaining respondents 
or more correctly ignored all other age groups. They are not even 
in the running, just because these other categories account for less 
than 19 respondents. This means the winner takes all. Not very 
fair is it? 


The question we still have to ask is this: How is the mean useful
in arriving at the response groups that we were earlier concerned
with as the fourth option? To answer this, we have to get just one
more value, called the standard deviation, which tells us what the
spread of the ‘values’ is.


MEASURES OF DISPERSION 


Consider the following sets of data:

Set 1     1  3  5  7  9  11     mean = 6
Set 2     1  3  7  9  10        mean = 6


In both sets, the mean is the same. But the number of items is
different. In the first set, there are three numbers below 6, but in
the second set only two are below 6. Is there a real difference
between the two sets? To answer this, we have two major
measures. These are the range and the standard deviation.


RANGE


This is the simplest measure of dispersion. It is the difference
between the largest and smallest value in the series.


For the first set, the range is 11-1 = 10.

For the second set, the range will be 10-1 = 9.


STANDARD DEVIATION 
This can be calculated through the following steps: 


1 Calculate the difference between mean and each item. 

2 Square each of the differences and find the sum of the squares. 

3 Divide the sum of the squares by the number of items. 

4 Find the square root of the quotient to get the standard 
deviation. 


Let us try this out with the two sets.

Set 1. [6-1=5] [6-3=3] [6-5=1] [7-6=1] [9-6=3] [11-6=5]
squaring 25+9+1+1+9+25 = 70

Set 2. [6-1=5] [6-3=3] [7-6=1] [9-6=3] [10-6=4]
squaring 25+9+1+9+16 = 60


Steps 3 and 4:  set 1  70/6 = 11.7.  Sq.rt = 3.4
                set 2  60/5 = 12.    Sq.rt = 3.5


So you see hardly any difference between the two sets. The question
then is: What do we make of this when the mean is the same? A
first reaction could be that the spread of ‘responses’ is much the
same in both sets, with the second set tending to slightly less
homogeneity than the first. But you can immediately object and
say that there are only five items in the second set compared to
six in the first. So will this not affect the results? Not really,
because both calculations have neutralised this difference by
dividing by the number of items and working with the MEAN value.


You can compute one more value which brings together both these
measures, the mean and the standard deviation. That is the
co-efficient of variation (CV) and is easy to compute.

CV = SD * 100/Mean.


The higher the CV, the greater the heterogeneity. These are the
results for the two sets.

Set 1   3.4 * 100/6 = 56.7
Set 2   3.5 * 100/6 = 58.3


So, the second set is marginally more heterogeneous than the
first.
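

The four steps for the standard deviation, and the co-efficient of
variation built on it, can be sketched in Python as follows. The
function names are our own, and the two sets are taken as 1, 3, 5,
7, 9, 11 and 1, 3, 7, 9, 10, as read from the differences worked
out above. Because the sketch does not round the standard deviation
before computing the CV, its CV figures can differ in the first
decimal from those worked by hand.

```python
import math


def standard_deviation(values):
    """Steps 1 to 4: differences, squares, average of squares, root."""
    mean = sum(values) / len(values)
    squared_differences = [(x - mean) ** 2 for x in values]  # steps 1, 2
    return math.sqrt(sum(squared_differences) / len(values))  # steps 3, 4


def coefficient_of_variation(values):
    """SD * 100 / Mean."""
    mean = sum(values) / len(values)
    return standard_deviation(values) * 100 / mean


set1 = [1, 3, 5, 7, 9, 11]
set2 = [1, 3, 7, 9, 10]
for name, data in (("Set 1", set1), ("Set 2", set2)):
    print(name,
          round(standard_deviation(data), 1),
          round(coefficient_of_variation(data), 1))
```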


What do you do to compute the standard deviation when you have 
a large number of items to work on? The following example will 
help you to recall that you have done this in a similar manner 
earlier as well. 


Class Int   Mid pt*   Frequency   Square of (2)   (2) * (3)   (3) * (4)

(1)         (2)       (3)         (4)             (5)         (6)
25 - 34     30          58          900             1740        52200
35 - 54     45         207         2025             9315       419175
55 - 64     60          40         3600             2400       144000
65 - 78     72          11         5184              792        57024
Total                  316                         14247       672399


* for convenience of computation rounded off.


Square of standard deviation = 672399/316 - (14247)²/(316)²

= 2127.84 - 2032.69 = 95.15


Therefore standard deviation = 9.75
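

The same working table can be followed in Python. This is a sketch
under the assumptions of the table itself (rounded mid-points, and
division by the total number of respondents); the function name
grouped_sd is our own.

```python
import math


def grouped_sd(midpoints, frequencies):
    """Standard deviation from grouped data.

    Uses: (sum of f*m*m)/N minus the square of (sum of f*m)/N.
    """
    n = sum(frequencies)
    sum_fm = sum(f * m for m, f in zip(midpoints, frequencies))
    sum_fm2 = sum(f * m * m for m, f in zip(midpoints, frequencies))
    variance = sum_fm2 / n - (sum_fm / n) ** 2
    return math.sqrt(variance)


sd = grouped_sd([30, 45, 60, 72], [58, 207, 40, 11])
print(round(sd, 2))
```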


There are other short cuts available to manually compute the 
standard deviation for large numbers of respondents. But I would 
suggest that you use a calculator or a computer. 


COMPARING SUB-GROUPS 


CROSS TABLES 


The simplest technique to ascertain whether or not there could be
a relationship between two variables is to build up the cross table.
You already know how this is done and will recall the example of
education and occupation. Since that was based on just 10
respondents and was mainly intended to show how a
cross-tabulation is done, let us take a couple of concrete examples
from the UHD study.


Sex by Time spent on work


               hardly     some     lots    Total

Male      N      14        91      146      251
          %      5.6      36.3     58.2
Female    N       4        13       48       65
          %      6.2      20.0     73.8
All       N      18       104      194      316
          %      5.7      32.9     61.4    100.0


From the above table, we can see that more men than women are
in each of the three categories of ‘time spent on work’. This is
because there ARE MORE MEN THAN WOMEN in the study. So
comparing mere numbers has no meaning.


In order to make some sense of the tendencies, we then have to 
neutralise the different numbers to make them comparable. Hence, 
the percentages. The first question here is how do we compute a 
percentage. The second question is in which direction do we 
compute the percentages. The third question is how do we compare 
after working out the percentage. 


Let us take up one question at a time. First, how do we calculate 
a percentage? Take the example of the males. You have three 
figures there. 14, 91, 146. These add up to 251. 


So,  14 * 100/251 =  5.6
     91 * 100/251 = 36.3
    146 * 100/251 = 58.2


Each of the cell values is multiplied by 100/251 = .398 and the
cell percentage is obtained. When you have a series of numbers
to be computed as percentages of a given total, it is most
convenient to first get the value of 100/N (total number), and
multiply this with each of the individual values to get their
respective percentages of the total.
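

The short cut of first working out 100/N and then multiplying can
be sketched in Python as follows (the function name
row_percentages is our own):

```python
def row_percentages(counts):
    """Percentages of each count to the row total, to one decimal."""
    factor = 100 / sum(counts)        # e.g. 100/251 for the males
    return [round(count * factor, 1) for count in counts]


print(row_percentages([14, 91, 146]))   # males
print(row_percentages([4, 13, 48]))     # females
```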


Now to the second question: In which direction do we calculate
the percentage in a cross table? The answer is in two steps. First,
determine which is the independent variable and which is the
dependent variable. Second, compute the percentage distribution
of the dependent variable categories against EACH of the
INDEPENDENT variable categories. To take the above example, sex
is the independent variable and time spent on work is the dependent
variable. So there are three categories of the dependent variable,
and two categories of the independent variable. In the second step,
you take the first of the two categories of the independent variable
(males in this case) and compute the percentage value of each of
the three dependent categories using the total of that independent
variable category as the total. So, of the 251 males, we get those
who hardly work (14 of them), those who work some of the time
(91 of them) and the rest, who work a lot (146 of them). Hence,
calculate the PERCENTAGE ALONG EACH OF THE INDEPENDENT
VARIABLE CATEGORIES.


We move to the third question. How do we compare the
percentages? You will notice that you calculate the percentage
horizontally for each of the categories of the independent variable.
So you compare vertically downwards, i.e., you compare what
percentage of males and of females are in the first category of
‘hardly work’. The answer is about 6 per cent for each (5.6 and
6.2, respectively). Similarly, you compare 36 per cent of males
and 20 per cent of females who are in the ‘some’ group. We find
here that a very much higher per cent of men compared to women
directors are in the ‘middle group’. Finally, 58 per cent and 74
per cent of men and women, respectively, are in the full-time
category. So, look at the whole thing again in a different way. The
percentage of men and women increases as one moves from low
to high levels of work. But comparing the men and women, we
find that women have a higher ‘rate of movement’ than men. So
the trend seems to be that women are more likely than men to
spend more time on social work compared to ‘non-social work
related activity’.

But when we take all three categories together and subtract the
female percentage from the male percentage in each (-0.6, +16.3
and -15.6, respectively), the men are ‘negative’ or lower at the
extremities but positive in the middle. Therefore we would say no
clear trend emerges, though we know for a fact that the men are
better trained.


Take a second example, again comparing men and women. The 
table is given below. 


Sex by training for social work


                 No       S-T      PG     Total

Male      N      43       103     105      251
          %     17.1      41.0    41.8

Female    N      14        28      23       65
          %     21.5      43.1    35.4

All       N      57       131     128      316
          %     18.0      41.5    40.5    100.0


We find from the above that a higher proportion of men than
women have gone through a full-time, post-graduate programme
in social work (compare about 42 per cent of men with 35 per cent
of women). At the other end, we have the reverse position as a
higher proportion of women than men have had no training
whatsoever (compare 17 per cent of men with 22 per cent of women
who have had no training). So we can say that men are more likely
than women to have had the benefit of training.


More generally, when you take the difference between the two
percentages (M per cent minus F per cent: -4.4, -2.1 and +6.4),
you find that the value increases as one moves from left to right.
Thus, one does see a trend of men improving their positions
compared to women.


In the final analysis, the cross-tabulations do help us to see whether
a trend emerges from our data or if there is at least a tendency.
The second table confirmed a trend and the first a tendency. The
question then is this: Is there some way that we can further
summarise the data and arrive at a single number or just a few
numbers to decide whether or not there is a real pattern that
emerges from the analysis? This question takes us to the first of
many statistical analyses.


CHI-SQUARE TEST 


The role of the Chi-square Test is to help us discern whether or
not the differences that we see through a visual comparison of
the figures for the sub-groups of the independent variable are
REAL differences, or are differences obtained only by chance,
i.e., due to errors or variations in the selection of respondents for
the study (the sampling variations).


First, what do we mean by REAL differences? The simple answer
is this: if from the same population you took a hundred samples,
what are the chances that the observed difference would occur at
least 95 times, or 99 times, or even 99.9 times? If the difference
would occur less often than the pre-designated level (i.e., 90 or
95 or 99 or 99.9 times in a hundred), then we will say that the
difference is not a real difference but is due to chance factors.
Usually we designate 95 per cent (more often stated as the .05
level) as the pre-designated level. Put the other way round, if a
difference of this size would arise by chance alone less than 5
times (.05) in 100 samples, we say the difference is a real
difference.


To test this we have to first get the cross-table. We will use the 
first of the two given above. 


Sex by time spent on work


               hardly     some     lots    Total

Male      N      14        91      146      251
Female    N       4        13       48       65

All       N      18       104      194      316


Next we have to calculate for each of the six cells the expected
values. This means we have to compute the value that would be
obtained if there was no real difference among the sub-groups in
each cell. In order to compute the expected cell values, we retain
the marginal values as given. So the expected value will differ
from one cell to the other because it is computed on the basis of
its corresponding marginal values. This is best understood through
the actual computing of the cell-expected values.


To compute the Chi-square value, we have to compute five sets 
of data in sequence. It is best that we first set up these steps in 
a tabular form so that the computation becomes easy. It is impor- 
tant to note that these steps are repeated for each of the cells in 
the cross-table. In our example, we have 6 cells and so we have 
to compute values 6 times before we can arrive at the summary 
value called Chi-square. 


Each cell will be identified by its two characteristics, as the
following examples show.

Cell m1 is males who hardly spend time doing social work.
Cell m2 is males who spend some time doing social work.
Cell m3 is males who spend all their time doing social work.

The female cells f1, f2 and f3 are identified in the same way.
Now, we can set up the tabular sequence of activities.


The second column is the observed cell frequency or value. We
enter those frequencies below under fo.


The third column is the value or frequency that would be expected
if there are no real differences among the sub-groups of
respondents. For each cell, multiply its column total by its row
total and divide by the grand total. The fe values are computed
as follows:


m1     18 * 251 / 316 =  14.30
m2    104 * 251 / 316 =  82.61
m3    194 * 251 / 316 = 154.09
f1     18 *  65 / 316 =   3.70
f2    104 *  65 / 316 =  21.39
f3    194 *  65 / 316 =  39.91


If you add up the fe values for males, you must get the same
total as the original 251 (plus or minus a couple of decimal values
because of the rounding off at two decimal places). Similarly,
check for the females. If either or both totals do not tally with
the original values, then recheck your computations.
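

The expected values, and the check that each row of them adds back
to the original row total, can be sketched in Python as follows.
The function name expected_frequencies is our own; the observed
table is the one given above.

```python
def expected_frequencies(observed):
    """Expected cell values: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in column_totals]
            for rt in row_totals]


observed = [[14, 91, 146],    # males:   hardly, some, lots
            [4, 13, 48]]      # females: hardly, some, lots

expected = expected_frequencies(observed)
for row in expected:
    print([round(fe, 2) for fe in row], round(sum(row), 2))
```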


Now enter these values under fe. 


        fo       fe      fo-fe    (fo-fe)²    (fo-fe)²/fe
m1      14     14.30     -0.30       0.09        0.006
m2      91     82.61      8.39      70.39        0.852
m3     146    154.09     -8.09      65.45        0.425
f1       4      3.70      0.30       0.09        0.024
f2      13     21.39     -8.39      70.39        3.291
f3      48     39.91      8.09      65.45        1.640


Note that if you add up the values that have been entered under
fo-fe, the result will be zero. This is the best test that your
calculations are correct and you have not made any mistake.


The next step is to square the fo-fe values and enter them in
the next column, fo-fe (squared), or better written as sqr(fo-fe).
Enter these values under the relevant column.


The penultimate step is to divide the sqr(fo-fe) values by their
corresponding fe values. The result of this computation is entered
in the last column. We calculate to three decimal places because
the table of Chi-square values that we consult at the end, as you
will see later, is also given to three decimal places.


The final step to obtain the Chi-square value is to add up the
values in the last column. You will now get the figure of 6.238.
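

The whole sequence of steps can be put together in a few lines of
Python. This is a minimal sketch with a function name of our own
(chi_square). Because it does not round the expected values to two
decimal places on the way, it gives about 6.24 rather than exactly
the hand-worked figure.

```python
def chi_square(observed):
    """Sum of (fo - fe) squared divided by fe over all the cells."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * column_totals[j] / grand_total
            total += (fo - fe) ** 2 / fe
    return total


observed = [[14, 91, 146],    # males:   hardly, some, lots
            [4, 13, 48]]      # females: hardly, some, lots

chi_sq = chi_square(observed)
print(round(chi_sq, 2))
```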


The question now is this: Is this summarised statistic indicative
of a significant difference between the males and the females in
their work pattern? To answer this, we have to do a little more
computation work. It is quite easy in fact. First, the computation
and then the logic of it. If you examine the original cross-table,
you will find that it has 2 rows and 3 columns to contain the raw
data or observed values.

From the 2 rows, deduct one and you are left with 1.
From the 3 columns, deduct one and you are left with 2.
Now 1 * 2 = 2. This is referred to as the Degrees of Freedom.


Before I explain this, take one more example. 


No. of rows = 3. So 3-1 = 2.
No. of cols = 5. So 5-1 = 4.
So degrees of freedom (D.F.) = 4 * 2 = 8.


To explain what these mean, let us start with the simplest of
examples. Let the hypothetical cross-table have just four cells and
the values are:


        c1     c2     ct
r1       5      9     14
r2       8      3     11
rt      13     12     25


So DF = (2-1) * (2-1) = 1 * 1 = 1 DF.


Now keeping the totals constant, suppose we changed r1c1 from
its present 5 to 6, then r1c2 HAS TO BE 8. It is only then that the
row total under ct will remain 14. If we changed r1c1 to 3, then
r1c2 will become 11. In effect, r1c2 has no freedom to be any
number ‘it likes’; its value is determined by r1c1.


Now look at r2c1. If we decided that r1c1 should be 6, then r2c1
HAS TO BE 7. Similarly, if r1c1 was put as 3, then r2c1 can only
be 10. So again, only r1c1 seems to have the freedom to change
its value, whereas the values of the other cells are predetermined
by what the value will be in r1c1. Therefore, only one of the four
cells has the freedom to decide its value. This can be any one
cell. This is what we refer to as the degree of freedom. So, in
this hypothetical example, the degree of freedom is 1.


Using the same argument in our UHD study example, the degree
of freedom will be 2. That means you can change any two values
(keeping the totals constant) AND THE OTHER CELL VALUES WILL
FALL IN LINE.


So the degree of freedom for a table is decided by the formula


DF = (r-1) * (c-1)
where r = number of rows
and c = number of columns


Now, to decide whether the Chi-square that we obtained for our
example implies a significant association between sex and pattern
of working, we have to take a statistics book, turn to the appendix
and ‘read’ from a statistical table called the Table of Chi-square.
Here is an extract from the tables, covering our immediate
interest:


D.F./p      .10       .05       .02
1          2.706     3.841     5.412
2          4.605     5.991     7.824
3          6.251     7.815     9.837


The Chi-square value is 6.238 and the DF is 2. Reading through
the 2nd row (i.e., 2 DF), we find that it is more than the value
shown under .10 and also higher than that under .05 but lower
than that under the .02 level. Since we have decided that all our
critical levels shall be .05, we say that the value of 6.238 that we
obtained is STATISTICALLY SIGNIFICANT at the .05 LEVEL. In other
words, there is a real difference between the pattern of work of
men and women.
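

The reading of the table can be sketched in Python as below. The
names are our own, and the small dictionary holds only the .05
column of the extract above, so the sketch works for 1 to 3 degrees
of freedom and no further.

```python
# .05 column of the Chi-square table extract (degrees of freedom -> value).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def degrees_of_freedom(rows, columns):
    """DF = (r-1) * (c-1)."""
    return (rows - 1) * (columns - 1)


def significant_at_05(chi_sq, rows, columns):
    """True if the Chi-square exceeds the tabled .05 value for its DF."""
    return chi_sq > CRITICAL_05[degrees_of_freedom(rows, columns)]


print(degrees_of_freedom(2, 3))
print(significant_at_05(6.238, 2, 3))
```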


CONTINGENCY COEFFICIENT 


This is a measure of correlation between two variables, whereas
Chi-square is a measure of probability of association. Since both
have the same logic, it is often useful to compute the contingency
coefficient (C), as it is easier to interpret. The formula to convert
the Chi-square to C is the following:


C = SQRT(Chi-sq/(N + Chi-sq))


Applying the Chi-square value that we got in our UHD example,
we get:

C = SQRT(6.238/(316 + 6.238)) = SQRT(6.238/322.238) = 0.139
C is always between 0 and 1; it cannot be negative and never
quite reaches 1.
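

The conversion is a one-line calculation in Python (a sketch; the
function name contingency_coefficient is our own):

```python
import math


def contingency_coefficient(chi_sq, n):
    """C = square root of Chi-sq / (N + Chi-sq)."""
    return math.sqrt(chi_sq / (n + chi_sq))


c = contingency_coefficient(6.238, 316)
print(round(c, 3))
```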


PEARSON’S COEFFICIENT OF CORRELATION 


The interrelationship between responses to two questions is ascer- 
tained through the computation of the correlation coefficient (r). 
It is invariably computed with raw data and not grouped data. 


This is quite a difficult statistic to compute but easier to interpret
because it has a value ranging from -1 to +1. For computational
procedures, refer to a statistics textbook. But again, one can get
an approximation of the R through the Chi-square using the
following formula:


Chi Sq = n * R² or R = SQRT(Chi Sq/n).


Let us introduce the Chi-square value we obtained earlier into this
formula and see the result:


R = SQRT(6.238/316) = .141
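

In Python the approximation reads as follows (a sketch; the
function name r_from_chi_square is our own, and the result is only
the approximation described above, not a correlation computed from
raw data):

```python
import math


def r_from_chi_square(chi_sq, n):
    """Approximate R as the square root of Chi Sq / n."""
    return math.sqrt(chi_sq / n)


r = r_from_chi_square(6.238, 316)
print(round(r, 2))
```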


So the correlation that we get between sex and pattern of work
is a low .14. But is this statistically significant? It can be decided
by referring to an appropriate table in an appropriate statistics
book. We will find that with a sample size of 316 (its D.F. is 314,
based on the formula DF = N-2), the R value must be above
.113 to be significant at the .05 level. Our result is .141 and so we
can say that the result obtained is statistically significant at the
.05 level. This is what we got through the Chi-square computation
as well.


So, we can conclude that there is a low but real difference between
the work pattern of male and female directors of social work. We
say low because the correlation as well as the Chi-square and the
Contingency values are all nearer the ‘lower end’ of an imaginary
scale of 0 to 100. But the difference is real because it is not due
to chance factors but is the value you will get in 95 out of 100
samples from the same population.


You now have quite a few ‘statistical tests’ at your disposal and,
as you gather confidence and skills, you can look up statistics
books for more, and more complex, tests.


BEYOND THE SAMPLE 


From the above procedures you now know how to prepare the
frequency distribution and extract the mean and standard
deviation. The question now is this: From the information we have
about the sample, can we derive the corresponding values for the
population from which the sample has been drawn? The answer
to this question is in the affirmative. An example here will tell
you how to make the necessary calculations.


Let us assume that the percentage of respondents who have 
accepted a programme as being beneficial to them is 54 per cent. 
Let us also assume that the number of respondents is 96. The 
question is: What would be the percentage of the universe that 
would be so favourably inclined? In order to arrive at the answer 
we are providing for a five per cent error in our results. 


SE = SQRT((.54*.46)/96) = .05
2SE = .10
.54 ± .10 = .64 to .44


i.e., between 44 per cent and 64 per cent 


In other words, if we had interviewed the whole population we
would probably have got between 44 and 64 per cent agreeing that
the programme has been beneficial to them. Obviously, the range
includes, at the one end, a minority of respondents and at the
other end a majority. But then the range is so wide because we
want to be 95 per cent confident that it contains the value for
the population.
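

The same calculation can be sketched in Python. The function name proportion_range is our own, and the multiplier of 2 standard errors simply follows the worked example above. Note that the text rounds the standard error to .05 before doubling it, while the code keeps the unrounded value; the range comes out the same when rounded to two places.

```python
import math


def proportion_range(p: float, n: int, multiplier: float = 2.0):
    """Return (standard error, lower limit, upper limit) for a sample proportion."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    se = math.sqrt(p * (1.0 - p) / n)  # SE = SQRT(p * q / n)
    margin = multiplier * se           # 2SE in the text
    return se, p - margin, p + margin


# 54 per cent of 96 respondents found the programme beneficial.
se, low, high = proportion_range(0.54, 96)

print(f"SE = {se:.3f}")                    # prints "SE = 0.051"
print(f"range = {low:.2f} to {high:.2f}")  # prints "range = 0.44 to 0.64"
```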


CHAPTER 9 


REPORT 
WRITING 


Report writing is the final stage of research. This is the vital end 
point at which you record all that you did from the beginning to 
the end and also what you should have done but did not do. 


The value of a research study is lost unless it is reported. If a 
study and its findings are not communicated, then the research 
experience and the substantive findings cannot be added to the 
reservoir of knowledge and skills. 


As you know, reporting is a matter of communication. What and 
how you communicate depends on answers to a few questions. 


Q. 1. What should be communicated? 


— the research question and procedure 
— the research experience 

— the findings 

— the conclusions 

— the recommendation. 


Q. 2. To whom should it be communicated? 


— the target group
— fellow scientists
— sponsor
— administration
— lay people.


In this context, what may seem appropriate for scientists would 
be boring and unintelligible for the lay person. In other words, 
ask the following questions about your potential audience: 


(a) What is their level of knowledge and understanding about
the research problem?
(b) What do they need to know about the study?
(c) How best can this information be presented?


Q. 3. Through what medium will the communication be made?


— comprehensive report
— summary report
— newspaper articles
— radio broadcast etc.


Q. 4. What effect is intended to be achieved by the communica- 
tion? 


— arouse public reaction

— add to scientific knowledge and skills

— get policy makers to make use of the conclusions and
recommendations.


Q. 5. What language should be used? This depends on the
audience or target group.


— if scientists and academicians, use precise terms;

— if public, use simple alternative language;

— if policy makers, use non-technical, non-abbreviated
terminology.


Now we come to a format of a typical survey research report. 


A. Cover page includes: 


— title


— author(s) 
— name and address of agency 
— date (year) of preparation of report. 

B. (a) Table of contents

(b) Foreword (if any) 

(c) Preface 

(d) Acknowledgement 

C. Introduction (Part I) 

(a) Genesis of the project and practical considerations leading 
to the study. 

(b) The statement of the research problem including the ob- 
jectives and their clarification and review of relevant 
literature. 

(c) Research Strategy: 

(i) Scope of the study 
— time 
— place 
— population 
— sources of data 
(ii) General strategy of the study 
— method, techniques, tools of data collection 
— sample design 
— organisation of data collection 

(d) Problems of data collection and how these were overcome 

(e) Processing and analysis 

(f) Major limitations of the study. 

(g) Chapterisation: a description of each subsequent chapter. 

D. Body of Report (Part II) 

Findings of the study: there is no definitive rule on how many
chapters; this depends on the data and analysis.

The findings must be treated in a meaningful manner. Findings
about each topic should preferably have this format:

(a) Introductory paragraph to say how this topic is important 
in the fulfillment of the objectives of the study 

(b) What do you anticipate to be the results, i.e., questions, 
hypothesis 

(c) Why you expect the reported outcome, vide (b) 

(d) Now, present the findings in the form of tables, statistical
summary, graphic presentation etc.

(e) Review the presentation vide (d) and say if your
expectations are fulfilled. If yes, go to the next topic (starting
with (a))

If no,

(f) Discuss why the hunch may not have been supported by the
findings

(g) Offer new possibilities. If offered, repeat steps (a) to (c)
again.

If you have the data, repeat steps (d) to (g)

If you do not have data, leave the topic in the second round
at (c) and go to the next topic.

E. Conclusions: 

(a) Give summary of Part I and Part II 

(b) Give your conclusions based on the findings 

(c) Give recommendations and substantiate these from your 
conclusions 

(d) Review your study and answer the question: “If I had to
redo this study, how would I go about it?” It is a
critical review of your research plan including the
methodology.

F. Appendix: 

(a) Tables, if any 

(b) Tools of data collection 

(c) Bibliography 


Some Examples 


Here are a few more hints on writing the findings. Our concern, 
therefore, is with what goes into the ‘findings’ chapters, how to 
write them and more important, how to interpret the findings. 


The major objective in discussing the findings which are already
presented in a tabular or pictorial form is to describe and
summarise them, as well as to draw the attention of readers to
their outstanding features, all the time moving from simple to
complex matters. If necessary, tell the reader how to read and
examine the data presented in a given table or graph.

Thus, first draw attention to the observable trend of the data for
the total group and then dwell upon the outstanding trends of the
data for sub-groups. For example,


We find the following responses to the question of whether the
respondents have opted, or intend to opt, for research projects:


50 per cent have opted or intend to opt
29 per cent are not opting or not intending
21 per cent are undecided.


Reviewing the same data from the view point of the sex of the
respondent, we find that:


20 per cent males and 43 per cent females have not opted
27 per cent males and 12 per cent females are undecided.


You will notice that the sentences are more in the form of points, 
and will need to be elaborated in a formal report. Incidentally, these 
examples are taken from my report on Women and Employment. 


We shall now take up examples for detailed illustrations. 


[Expectation]: Collegiate respondents are more likely than
non-collegiate respondents to be in the labour force because they
are better qualified.


EDUCATION LEVEL AND LABOUR FORCE STATUS
(in percentage)


                     In the Labour Force
Education            Yes      No      Total
Non-Collegiate        69      31        319
Collegiate            66      34        281


[Outcome]: The above table reveals that there is no statistically
significant difference between the two categories of respondents,
i.e., no association between education and labour force status
(Chi-Sq. = 0.667; d.f. = 1; P. at .50 = 0.455).
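

For readers who wish to check the Chi-square figure, here is a minimal sketch in Python. It assumes that the ‘Total’ column gives the number of respondents in each row and that ‘Yes’ and ‘No’ are row percentages, so the actual counts have to be rebuilt by rounding; both function names are our own. Under these assumptions the result agrees with the value reported above.

```python
def counts_from_percentage(pct_yes: float, total: int):
    """Rebuild approximate [yes, no] counts from a row percentage and a row total."""
    yes = round(pct_yes * total / 100)
    return [yes, total - yes]


def chi_square(table):
    """Return (Chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


observed = [
    counts_from_percentage(69, 319),  # Non-Collegiate: in labour force yes / no
    counts_from_percentage(66, 281),  # Collegiate: in labour force yes / no
]
chi_sq, df = chi_square(observed)

print(observed)                               # prints [[220, 99], [185, 96]]
print(f"Chi-Sq = {chi_sq:.3f}, d.f. = {df}")  # prints "Chi-Sq = 0.667, d.f. = 1"
```

Because the counts are rebuilt from rounded percentages, the same function applied to the other tables in this chapter may not reproduce the printed Chi-square values exactly.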


[Explanation]: Why is there no difference? 


(a) Probably women seek education for various reasons, e.g., as
an accomplishment, a cultural attainment, a pre-requisite for a
good marriage, to fill in the waiting period for marriage, to find
a husband among classmates.

(b) Maybe because of methodological reasons, i.e., only educated
women with a minimum educational level were selected for the
study.


[Expectation]: If a woman goes in for vocational training, it must
be a clear indication of her intention to work or join the labour
force (L.F.).


VOCATIONAL TRAINING AND LABOUR FORCE STATUS
(in percentage)


                       In the Labour Force
Vocational Training    Yes      No      Total
Yes                     78      22        249
No                      58      42        323


[Outcome]: Differences are statistically significant (Chi-sq =
27.588; d.f. = 1; P. at .001 = 10.827). In other words, a
significantly higher percentage of women who undergo vocational
training are in the labour force, and the reverse is true of those
who have not undergone vocational training.


Marital Status: Earlier findings revealed that age and marital status 
are highly interrelated. Also, age and labour force status are highly 
correlated. 


[Expectation]: A higher proportion of unmarried women compared
to married women would be in the labour force. Why would fewer
married women be in the L.F.?

(a) Family responsibilities

(b) Social norms do not encourage them to work

(c) Women are supplementary earners and hence stay in the L.F.
only temporarily, till marriage

(d) Particularly when economic condition of her husband is poor.


MARITAL STATUS AND LABOUR FORCE STATUS
(in percentage)


                     In the Labour Force
Marital Status       Yes      No      Total
Unmarried             93       7        257
Married               47      53        331


[Conclusion]: Expectations fulfilled as data reveal highly statisti- 
cally significant difference between the married and unmarried. 


But we note that, from the social viewpoint, as many as 47 per
cent of the married women are in the L.F. Why? The probable
reasons are:


(a) no major responsibility at home 
(b) no children especially 6 years or less. 


If so, then 


[Further Expectation]: More married women with children six 
years of age or less will be outside L.F 


MARRIED WITH CHILDREN (6-) AND LABOUR FORCE STATUS
(in percentage)


                     In the Labour Force
Children (6-)        Yes      No      Total
Yes                   45      55        275
No                    58      42         72


[Outcome]: Not statistically significant. 


Let us sort out one more detail i.e., number of children of 6 years 
of age or less. 


NUMBER OF CHILDREN (6-) AND LABOUR FORCE STATUS
(in percentage)


                     In the Labour Force
No. of Children      Yes      No      Total
None                  60      40         72
1—2                   47      53        188
3+                    40      60         87


In other words, as the number of children aged 6 years or less
increases, the percentage of women in the L.F. decreases. Why do
these 40 or even 46 per cent women still work? The study throws no
light on this but we can expect one of the following reasons:


(a) some employed before marriage and so continued working 

(b) they had other assistance for domestic work 

(c) belong to low income group and hence need the money 

(d) using leisure time to earn money for continuous consumption 

(e) in spite of household responsibility, other forces may operate 
e.g., unhappiness in family. 


RECOMMENDATIONS 


How do we make recommendations on the solution of the problem 
which has been studied? For example, a study of housewives’ 
levels of awareness, and action taken in regard to ‘cleanliness’ 
reveals the following: 


1. All individuals who were interviewed were generally conscious
of the existence of the problems of environmental cleanliness and
sanitation, the factors responsible for them and the consequences
of having to live with the problem

2. Nearly all individuals were able to suggest solutions for the
eradication of the problem. The solutions seemed to lie initially
with the public authorities as the number and quality of
facilities available to the community at large were inadequate.
The respondents seem to have been under the impression that
it was only when these facilities were made available that the
individual citizen could play his/her role by making use of
these facilities

3. Irrespective of the levels of awareness of problems and their
solutions, the majority did not seem to act on their knowledge.


Given these findings, the question is: What can be done to rectify
the situation?


[Assume]: that the citizen cannot legitimately wait for the public
authorities to provide all the facilities to the full extent before she
will act.


On the other hand, a conscientious active community may have 
to take the first step to prove, so to say, to the public authorities 
that the facilities are inadequate, and more important that the 
citizens themselves are doing all they can to deal with the problem. 


The role of any organisation which undertakes a campaign in this 
respect would be twofold: 


1. Arouse the community 
2. Mediate with the authorities. 


Since the emphasis in this study is on citizens and their role,
the recommendations must pertain to the first role, arousing the
community.


The recommendation is that: 


1. It is not necessary to educate the public about what the
problem is and what happens if the problem is neglected.

2. There needs to be a pragmatic strategy to get the citizens to 
implement what they already know and then educate them 
regarding the means to be used with minimum effort to 
implement the strategy. 


APPENDIX 
EXERCISE 
WORK-FLOW CHART FOR A SURVEY RESEARCH 
PROJECT 


Critically review the steps given below and comment on the
appropriateness of each step (especially the sequence in which
it appears in the chart, completeness of information, and so on).
In the light of your critical comments revise the Work-Flow Chart
so that it can be used as a rule-of-thumb guide for beginning
researchers.


1. A felt need: A felt difficulty in adaption of means to a 
discussed end i.e., manpower planning 


2. Selection of the topic: Analyse what is known, look for the
gaps and deficiencies, follow clues and suggestions from reading,
discussions and seminars, based on the reports of previous studies.
Out of the group of recognised problem situations, choices have
to be made so that they can be dealt with one at a time, on
a priority basis.

3. Preliminary work to develop a bibliography of existing
literature. Report on various studies, discussions, seminar papers,
articles both published and unpublished.


4. Read the bibliography. Take notes and make annotations for all
the sources available, published as well as unpublished, before a
reasonable selection can be made. Bibliography development
work does not stop at any particular stage but may continue
till the end of the project. It goes on building as the study
moves forward. More and more references are made as one
goes on reading the bibliography. Things more pertinent are
accepted and others rejected.


5. Definition and differentiation of ‘specific aspects’ of the
topic. The specific aspects are classified. They are examined in
the light of their relevance to the topic and further discussed to
identify each and every detailed component of the subject, and
also to get a wider perspective in the light of the reading of
the bibliography.


6. Statement of the problem. Being pretty clear about the topic
and its components, sub-components and their relationships,
one is now in a better position to point out the problem. First
state the problem in a clear and precise form. At this stage care
should be taken to avoid overlapping and confusion in using
terms. Problems should not be too inclusive, undefined or too
narrowly limited. Select one problem out of the list. Examine
it in the light of the hunch that has been troubling you so
far. State and define the problem (‘To define a problem means
to fence around it, to separate it by careful distinction from
like questions found in related situations of need’ — Whitney).


7. Identification of sub-areas of the problem stated. After
defining the problem, efforts should be made to identify the
various sub-areas of the problem. They again should be clear,
precise and self-explanatory. A long list of the sub-areas should
be made. Doubts and ambiguities should be further clarified
through discussions and, if necessary, by falling back on the
notes and books referred to earlier. A final examination should be
made to ensure that nothing important is left out.


8. Detailed statement of the objectives. State the objectives in
clear terms. Objectives can be as many as possible because at a
later stage some may be deleted or modified. The probable areas
covered by the problem should not be left out.


9. Listing of items of information. Taking every objective one at
a time, the items on which information is required should be
listed; the list should be exhaustive. Sometimes an item may not
be directly related to the problem but may be important because
some other item has a sure relation to it. This should be thought
out well beforehand. Sometimes a few items may be repeated for
two or even more sub-areas; in such cases they need not be
repeated again and again but reference should be made to the
sub-area/s where the items appeared. Care must be taken to avoid
overlooking the underlying meanings of items and thus leaving
them unlisted.


10. Check the list of items of information. At this stage the
items should be checked according to their importance and
reference. Items that appear to be interesting but are not directly
related can also be retained for the purpose of collecting
interesting information.


11. Scrutiny of items will depend upon the following factors:
ascertain the sources (primary/secondary); availability of
information on each item listed; authenticity of information, if
we get it; adequacy of information, if it is available and
authentic.


12. Selection of the variables. The immediate next step, after the
listing of variables, is to select the variables on the basis of
their usefulness and the possibility of association or
relationship between the variables.


13. Classification of variables: independent or dependent. It
should also be indicated which variable can be dependent as well
as independent, and in what context.


14. Identification of cross tables. This should be done in relation
to the objectives and the requirements of the study; check
the list of cross tables, although this should be kept open for
further modification.


15. Explaining the rationale for every table. If essential, from
what point of view? If interesting, why interesting? What findings
or data is it likely to bring out?


16. Drafting the schedule, taking care of all necessary
information.


17. Check format for pre-coded questions. The formats should be
filled in, if necessary, to make sure that they are correct.


18. Typing the schedule for pre-testing.


19. Preparation of instructions for the investigators. These
should be prepared before the pre-testing starts and the
instructions should be verified during the pre-testing stage.


20. Definition of the population. At this stage the population
considered for the study must be defined precisely. It should
be remembered that the significance of the findings of the
research study will be limited to whatever population is
sampled.


21. Definition of the terms needed. Researchers’ definitions or
meanings of geographical area/agency/individual, etc., used in
the study.


22. Sources of population: (see 20). In determining the sources
of the population from which the sample units are to be
selected, the researcher must relate her description of the
population to the purpose of her investigation and establish
the boundaries, or frame, of the population according to the
characteristics of the sample units to be included and their
scope.


23. Unit of study. What is the unit? E.g., if we say ‘agency’ or
‘individual’ do we have anything in mind, i.e., type of agency,
field of services, beneficiaries served, etc.? If it is individual,
do we have any criteria like age, sex, income, S.E.S.,
educational training etc.? These things should be defined. “The
frame of enquiry should define the categories of materials or
individuals to be covered in the investigation and define the
geographical scope within which that is to be carried out.
(Therefore the sampling units selected within the frame may
be the same or they may be groups of such units possessing
all the characteristics required.)”


24. Techniques of selecting the units. What technique is to be
applied to draw the sample; rationale for the decision. Decision
about provision for substitutes and when substitutes will be
provided. At the stage of taking the decision the researcher
has to anticipate the difficulties that might occur in locating
or finding the selected units.


25. Size of the sample. Logically speaking, the size of the sample
should be dependent upon the extent to which the sample is
representative of the universe or population to be studied; the
inclusiveness of the sample; the types of group(s) involved;
the number of categories of data required and the method of
analysis of data. But over and above that, other factors to be
considered in this connection are:

Time: the period during which the survey has to be completed.
Money: the fund available for the purpose of the study.
Personnel: the availability of qualified staff to carry out the
task.


26. Preparation of the list of agencies, households or individuals
on the basis of the criteria decided, that is, preparing the frame
for the purpose of sampling.


27. Actual drawing of the sample (either with the help of
statistical theory or personal judgement, i.e., purposive sample
and quota sample).


28. Listing the name, address and location of these selected units.


29. Arrangements for contacting the selected agencies or
individuals, through letters or a circular.

(a) Generally, in the case of agencies a circular or a request
letter can be addressed in advance, but in the case of
households/individuals it may not be possible to send individual
letters; these can be carried by the investigators while going
for interviewing.

(b) If the study is broad based then newspaper publicity can
be given.


30. Pre-testing of schedule.

(a) Arrangement for pre-testing of the schedule: Who will
pre-test the schedule? Will investigators be appointed, or will
it be the other research staff such as Research Assistants/
Research Officers, etc.? If investigators have to be appointed
then the training of the investigators should take place
before this. Where possible it is preferable that the other
staff (R.As/R.Os etc.) should also conduct pre-testing. This
will give them an idea of, or insight into, the type of
responses they are getting and on that basis it will help
them to revise or modify the schedule to be finalised.

(b) Where is the pre-testing of the schedule to be conducted: in
the area where the actual sample is going to be selected or
outside it?


31. (a) Pre-testing the schedule. During the pre-testing the
investigators, apart from recording the necessary data, should
record the type of reaction of the respondents to each and
every question, e.g., whether the respondent found it
difficult to understand the question due to ambiguity
in the language; whether the respondent was doubtful or hesitant
to give certain statistical data; whether the respondent was
enthusiastic in giving a reply to certain questions. These
reactions should be recorded apart from the ‘General Remarks’
that an investigator is supposed to record. These remarks will
help the researcher to revise the schedule in a better way.

(b) Suggestions to overcome such difficulties should be
incorporated in the ‘Instructions to the Investigators’. This
means the instructions to the investigators should again be
revised at this stage.


32. A report on the pre-tested schedule. However small it may
be, a report should be prepared on the basis of the pre-tested
data. If time does not permit the analysis of each and every
question, a few critical questions can be selected and analysed.


33. Revision of the draft schedule. In the light of this analysis the
schedule should be revised.


34. Check for format. In what format the schedule should be 
prepared (for certain questions pre-coded questions should be 
prepared). 


35. Typing and stenciling the schedule in its final form. 


36. Typing out the copies of Instructions for the Investigators.


37. A decision regarding appointment/recruitment of the inves- 
tigators should be made. 


38. Considerations before interviewing


(a) The first thing to be considered before appointing the
investigators is how many are needed; this again has
implications for the time factor. The questions to be decided
are:

(i) how many interviews an investigator can conduct in
a day or a week

(ii) how much time we have at our disposal to complete
data collection, and how many interviews we have to
conduct

(iii) therefore, the number of investigators to be
appointed to complete that many interviews during
the stipulated period.

(b) The second thing to be considered is how the investigators
will be paid: on a piece rate basis, or a monthly salary?
This consideration should be based on the following:

— budget provision (amount for data collection)

— whether the quality of data will be good (if appointment
is made on a piece rate basis or a salary basis)

— whether the quantity of data (number of interviews) will
be more if appointed on a piece rate or on a salary.


39. Appointment and training of investigators


(a) Before appointment, when the investigators come for an
interview with the Project Director, it should be made
clear to them that their appointments will be confirmed
only when they are found satisfactory in a minimum of
two trial interviews.

(b) Before going for their trial interviews the investigators
should be oriented thoroughly about their job: how to
approach the respondents; the purpose of the study; how
to fill up the schedule clearly and legibly, without using vague
terms (e.g., ‘service’ in the case of occupation, or ‘respondent
was quite good in his response’ in the case of remarks).

(c) The trial interviews can also be given in the presence of
supervisors. If the investigators are found satisfactory they
can be appointed.


40. Data collection starts. For the first few interviews of each
and every investigator, direct supervision in the field is
necessary for the following reasons:


(a) The investigators may tend to misinterpret the questions
when explanations are asked for by the respondents.

(b) They may tend to fill in the information wrongly and
sometimes also in the wrong places.

(c) Direct supervision during the first few interviews may
lessen the number of errors for the rest of the interviews.
This will also save a lot of time on the part of the supervisors
in correcting the schedules and referring them back to the
field for reinterviewing.


41. Scrutinising the schedules


(a) The returned filled-in schedules should be thoroughly
scrutinised by the supervisors: for the first few hundred
schedules each and every schedule should be scrutinised.

(b) Once the supervisors are sure of the quality and the
honesty of the investigators or any particular investigator,
then they may scrutinise a few questions (those which are
doubtful) thoroughly and the rest at random.
The schedule would then be sent back to the field for
reinterview or correction.


42. Preparation of the code book.


43. Checking the code book for its final format.


44. Preparation of instructions for the coders.


45. If coders have to be appointed, trying out a few schedules
with the coders.


46. Tabulation schedule (whether any additional table can be
presented, or is needed; how the open-ended questions will be
taken care of, etc.) and data processing.


47. Coding work starts: (a) coding, (b) punching and verification
of data, (c) machine processing of data.


48. Data analysis: (a) checking and correction of tables as and
when they come out, (b) decoding of the tables, (c) regrouping
of data or preparation of the final format of the tables,
(d) percentaging/application of statistical tests etc., (e) making
a skeleton outline of the project, i.e., chapterisation.


49. Introduction chapter draft should be ready.


50. Body of the report (draft).


51. Appendix (preparation of the materials to be included in the
body).


52. Bibliography/Reference etc.


53. Revision of the draft report.


54. Finalisation of the report. At this stage stenciling work should
start as and when a chapter is finalised.
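

As an aid to reviewing step 38 (a) of the chart, the three questions listed there reduce to a simple calculation, sketched below in Python. The function name and all the figures (600 interviews, 4 interviews per investigator per day, 25 working days) are hypothetical values chosen only for illustration.

```python
import math


def investigators_needed(total_interviews: int,
                         interviews_per_day: int,
                         working_days: int) -> int:
    """Return the smallest number of investigators who can finish in time."""
    if min(total_interviews, interviews_per_day, working_days) <= 0:
        raise ValueError("all three figures must be positive")
    # Interviews one investigator can complete in the time available.
    capacity_per_investigator = interviews_per_day * working_days
    # Round up: part of an investigator still means one more appointment.
    return math.ceil(total_interviews / capacity_per_investigator)


# Hypothetical example: 600 interviews, 4 per day each, 25 working days.
print(investigators_needed(600, 4, 25))  # prints 6
```

In practice a margin would usually be added for refusals, reinterviews and investigators who drop out; the sketch leaves that judgement to the researcher.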


Dear Reader 


If, after reading this PRIMER and attempting to revise the
Work-Flow Chart, you are inclined to undertake a low-cost, small to
medium sized survey research project, and in this endeavour you
would like to have my guidance, please feel free to write to me
at the address below giving some information about yourself
(name and address, sex and age, education and occupation, nature
of interest in research, research experience and subject area of
interest).


We can then jointly plan and execute, stage by stage and step by 
step, the study you have in mind. As you have probably guessed 
by now, my interest is in the broad area of social welfare which 
includes community organisation and peoples’ actions. Given my 
interest in and commitment to social research, the guidance is 
being offered gratis as a goodwill service to budding social 
researchers, no matter where they live and what they are doing. 


If, in response to this offer, I receive a large number of requests,
we may be able to form a network of social researchers,
exchanging information and experiences, and ways and means of
individual and collective growth in this fascinating field of social
research, and endeavour to do work that would be of direct, and
hopefully immediate, use to the community in which we live and
to those with whom we interact.


Yours Sincerely, 


Prof. P. Ramachandran
P.B. 117234, Chembur 
Bombay 400 071. India. 


Prof. P. Ramachandran, Director, Institute of Community
Organisation Research, Bombay, and former Head, Department
of Research Methodology at the Tata Institute of Social
Sciences, Bombay, has in the last 37 years undertaken on behalf
of Central, State and Local Governments, organisations and
voluntary agencies, many research studies in the area of
housing, social welfare education and personpower, social
problems, research methods, community development and
community issues. Included in his large number of writings,
published and unpublished, are the following publications:


Training in Research Methodology in Social Sciences in India,
Delhi: Indian Council of Social Sciences Research, 1974;
Missing Values: Alternatives in Data Analysis, Bombay: Tata
Institute of Social Sciences, 1987; Social Welfare Manpower in
Greater Bombay, Bombay: Somaiya Publications (P) Ltd., 1977;
An Attempt at Raising Consciousness, Secunderabad: Andhra
Pradesh Social Service Society, 1985; “Caste and
Consciousness: An Inverted Pyramid?”, Indian Journal of
Social Work, 47: 3, Oct. 1986 (Co-Author); Thrust Towards
Community Organisation: The Regional Report, Tiruchirapalli:
Tamilnadu Social Service Society, 1989 (Co-Author); Towards
Integrated Human Development, New Delhi: Caritas India, 1990
(Senior Author).


Prof. Ramachandran is also an Executive Member and
Consultant to a number of Evaluation Research Committees at
the National, State and Local levels, especially with
non-governmental organisations. Along with his colleagues in
ICOR, he is currently engaged in developing a tool to study
“History From Below” with particular reference to Peoples’
Movements in India.


This is not a textbook but a PRIMER on Survey
Research. In a lucid style, and using four
easy-to-follow examples, Prof. P. Ramachandran
explains the major steps in Survey Research which
are: Problem Formulation, Research Strategy,
Methods of Data Collection, Sample Design,
Analysis Design, Field Data Collection, Data
Processing, Data Analysis and Reporting. Not only
is the work systematic, but it is a ‘shot in the arm’
for women as the entire work utilises the female
gender only. The author’s vast experience as a
researcher as well as a research teacher has enabled
him to envisage problems that beginners in research
normally face.


